Description

[0001] This invention relates to novel compositions of matter, hereinafter called targeted multifunctional proteins,
useful, for example, in specific binding assays, affinity purification, biocatalysis, drug targeting, imaging, immunological
treatment of various oncogenic and infectious diseases, and in other contexts. More specifically, this invention relates
to biosynthetic proteins expressed from recombinant DNA as a single polypeptide chain comprising plural regions, one
of which has a structure similar to an antibody binding site, and an affinity for a preselected antigenic determinant, and
another of which has a separate function, and may be biologically active, designed to bind to ions, or designed to
facilitate immobilization of the protein. This invention also relates to the binding proteins per se, and methods for their
construction.

[0002] There are five classes of human antibodies. Each has the same basic structure (see Figure 1), or multiple
thereof, consisting of two identical polypeptides called heavy (H) chains (molecular weight approximately 50,000 d)
and two identical light (L) chains (molecular weight approximately 25,000 d). Each of the five antibody classes has a
similar set of light chains and a distinct set of heavy chains. A light chain is composed of one variable and one constant
domain, while a heavy chain is composed of one variable and three or more constant domains. The combined variable
domains of a paired light and heavy chain are known as the Fv region, or simply "Fv". The Fv determines the specificity
of the immunoglobulin; the constant regions have other functions.

[0003] Amino acid sequence data indicate that each variable domain comprises three hypervariable regions or loops,
sometimes called complementarity determining regions or "CDRs", flanked by four relatively conserved framework
regions or "FRs" (Kabat et al., Sequences of Proteins of Immunological Interest [U.S. Department of Health and Human
Services, third edition, 1983, fourth edition, 1987]). The hypervariable regions have been assumed to be responsible
for the binding specificity of individual antibodies and to account for the diversity of binding of antibodies as a protein
class.

[0004] Monoclonal antibodies have been used both as diagnostic and therapeutic agents. They are routinely
produced according to established procedures by hybridomas generated by fusion of mouse lymphoid cells with an
appropriate mouse myeloma cell line.

[0005] The literature contains a host of references to the concept of targeting bioactive substances such as drugs,
toxins, and enzymes to specific points in the body to destroy or locate malignant cells or to induce a localized drug or
enzymatic effect. It has been proposed to achieve this effect by conjugating the bioactive substance to monoclonal
antibodies (see, e.g., Vogel, Immunoconjugates. Antibody Conjugates in Radioimaging and Therapy of Cancer, 1987,
N.Y., Oxford University Press; and Ghose et al. (1978) J. Natl. Cancer Inst. 61:657-676). However, non-human
antibodies induce an immune response when injected into humans. Human monoclonal antibodies may alleviate this
problem, but they are difficult to produce by cell fusion techniques since, among other problems, human hybridomas
are notably unstable, and removal of immunized spleen cells from humans is not feasible.

[0006] Chimeric antibodies composed of human and non-human amino acid sequences potentially have improved
therapeutic value as they presumably would elicit less circulating human antibody against the non-human
immunoglobulin sequences. Accordingly, hybrid antibody molecules have been proposed which consist of amino acid sequences
from different mammalian sources. The chimeric antibodies designed thus far comprise variable regions from one
mammalian source, and constant regions from human or another mammalian source (Morrison et al. (1984) Proc. Natl.
Acad. Sci. U.S.A. 81:6851-6855; Neuberger et al. (1984) Nature 312:604-608; Sahagan et al. (1986) J. Immunol.
137:1066-1074; EPO application nos. 04302368.0, Genentech; 85102665.3, Research Development Corporation of Japan;
85305604.2, Stanford; P.C.T. application no. PCT/GB85/00392, Celltech Limited).

[0007] It has been reported that binding function is localized to the variable domains of the antibody molecule located
at the amino terminal end of both the heavy and light chains. The variable regions remain noncovalently associated
(as V H V L dimers, termed Fv regions) even after proteolytic cleavage from the native antibody molecule, and retain
much of their antigen recognition and binding capabilities (see, for example, Inbar et al., Proc. Natl. Acad. Sci. U.S.A.
(1972) 69:2659-2662; Hochman et al. (1973) Biochem. 12:1130-1135; and (1976) Biochem. 15:2706-2710; Sharon
and Givol (1976) Biochem. 15:1591-1594; Rosenblatt and Haber (1978) Biochem. 17:3877-3882; Ehrlich et al. (1980)
Biochem. 19:4091-4096). Methods of manufacturing two-chain Fv substantially free of constant region using
recombinant DNA techniques are disclosed in U.S. 4,642,334 and corresponding published specification EP 088,994.

Summary of the Claimed Invention 

[0008] According to one aspect of the present invention there is provided a single polypeptide chain. The single
polypeptide chain comprises a linking sequence which is at least 10 amino acid residues in length. The linking sequence
connects a first and a second non-naturally peptide-bonded, biologically active polypeptide domain to form a single
polypeptide chain. This single polypeptide chain comprises at least two biologically active domains connected by the
linking sequence. The linking sequence comprises hydrophilic peptide-bonded amino acids exhibiting small and
unreactive side chains but no cysteine. The hydrophilic amino acids constitute a hydrophilic sequence which has a flexible
unstructured configuration essentially free of secondary structure in aqueous solution. The linking sequence contains
a plurality of glycine or serine residues and spans the distance between the C-terminal end of the first domain and the
N-terminal end of the second domain.
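
The sequence-level criteria stated in this paragraph (minimum length, absence of cysteine, a plurality of glycine or serine residues) lend themselves to a simple check. The following is a minimal sketch and not part of the original disclosure: the function name `check_linker`, the use of one-letter amino acid codes, and the reading of "plurality" as "two or more" are illustrative assumptions, and properties such as hydrophilicity or lack of secondary structure are not evaluated.

```python
# Illustrative sketch only; names and the "two or more" threshold are assumptions.
GLY4SER_X3 = "GGGGS" * 3  # (GlyGlyGlyGlySer)3 in one-letter code


def check_linker(seq: str, min_length: int = 10) -> dict:
    """Report which of the stated sequence-level linker criteria are met."""
    seq = seq.upper()
    gly_ser_count = sum(seq.count(aa) for aa in "GS")
    return {
        "length_ok": len(seq) >= min_length,      # at least 10 residues
        "cysteine_free": "C" not in seq,          # no cysteine
        "plural_gly_or_ser": gly_ser_count >= 2,  # plurality of Gly or Ser
    }


result = check_linker(GLY4SER_X3)
print(len(GLY4SER_X3), result)
```

A sequence that fails any criterion is reported per criterion rather than rejected outright, so the sketch can be used to see which requirement a candidate linker misses.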

[0009] The linking sequence of the polypeptide chain described in the previous paragraph may contain threonine.
The first and second non-naturally peptide-bonded, biologically active domains may be connected to the linking
sequence such that the linking sequence is peptide-bonded at its N-terminus to the first biologically active domain and
at its C-terminus to the second biologically active domain. The linking sequence may further comprise plural consecutive
copies of an amino acid sequence, for example the amino acid sequence (GlyGlyGlyGlySer)3, and may further comprise
one or a pair of amino acid sequences recognizable by a site-specific cleavage agent.

[0010] According to another aspect of the invention there is provided a polypeptide linker. The polypeptide linker has
a length of at least 10 amino acid residues and links two non-naturally linked polypeptide domains to form a
multifunctional protein. The linker exhibits amino acids with small and unreactive side chains and comprises plural hydrophilic
peptide-bonded amino acids constituting a hydrophilic sequence. The linker spans the distance between the C-terminal
end of a first domain and the N-terminal end of a second domain. Each domain comprises a biologically active
polypeptide having a conformation suitable for biological activity independent of the biological activity of the other domain.
[0011] The polypeptide linker described in the above paragraph may, independently, comprise threonine, be
cysteine-free, comprise a plurality of glycine or serine residues, comprise plural consecutive copies of an amino acid sequence,
span a distance of at least 4 nm (40 Angstroms), comprise the amino acid sequence (GlyGlyGlyGlySer)3, or comprise
one amino acid sequence or a pair of amino acid sequences recognizable by a site-specific cleavage agent. At least
one of the polypeptide domains linked by the polypeptide linker described in the above paragraph may comprise an
enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation
factor, a lymphokine, a cytokine, a hormone, a remotely detectable moiety or an anti-metabolite. The first domain linked
to the polypeptide linker described in the above paragraph may comprise a single chain binding site, and the second
domain linked to the polypeptide linker described in the above paragraph may comprise an enzyme, a toxin, a receptor,
a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation factor, a lymphokine, a cytokine,
a hormone or an anti-metabolite. At least one of the domains linked to the linker described in the previous paragraph
may comprise a polypeptide capable of sequestering an ion, such as preferably calmodulin, metallothionein, a
fragment thereof, or an amino acid sequence rich in at least one of glutamic acid, aspartic acid, lysine and arginine. The
amino acids of the linker described in the previous paragraph may assume an unstructured polypeptide configuration
in aqueous solution.

[0012] According to another aspect of the invention there is provided a polypeptide linker. The polypeptide linker has
a length of at least 10 amino acid residues and links two non-naturally linked polypeptide domains so as to form a
functional protein. The linker exhibits amino acids with small and unreactive side chains and comprises plural
hydrophilic peptide-bonded amino acids constituting a hydrophilic sequence. The linker spans the distance between the
C-terminal end of a first domain and the N-terminal end of a second domain. Together, the domains comprise an
immunologically reactive binding site for a preselected antigen.

[0013] The two polypeptide domains linked by the linker described in the previous paragraph may be of such a nature
as to mimic a V H and V L chain from a natural immunoglobulin. The polypeptide linker described in the above paragraph
may, independently, comprise threonine, be cysteine-free, comprise a plurality of glycine or serine residues, comprise
plural consecutive copies of an amino acid sequence, span a distance of at least 4 nm (40 Angstroms), comprise the
amino acid sequence (GlyGlyGlyGlySer)3, or comprise one amino acid sequence or a pair of amino acid sequences
recognizable by a site-specific cleavage agent. The amino acids of the linker described in the previous paragraph may
assume an unstructured polypeptide configuration in aqueous solution.

[0014] Further aspects of this invention provide DNA encoding the polypeptide chain described above, DNA encoding
the respective polypeptide linkers described above and a host cell transformed with and capable of expressing the
DNA encoding the respective polypeptide linkers described above.

[0015] Note: The instant application is a divisional application of EP 88905298.1, now granted as EP 0 318 554. As
much of the description of the parent application is necessary to understand the subject matter claimed in the instant
application in its proper context, large portions of the parent description were left intact in adapting the instant description
to the allowed claims. It should not, however, be inferred that subject matter contained in the instant description and
covered by claims of EP 0 318 554 is included in the subject matter instantly claimed.

[0016] As used herein, the phrase biosynthetic antibody binding site or BABS means synthetic proteins expressed
from DNA derived by recombinant techniques. BABS comprise biosynthetically produced sequences of amino acids
defining polypeptides designed to bind with a preselected antigenic material. The definition of BABS according to the
instant divisional application does not include the polypeptide constructs as claimed herein. The structure of these
synthetic polypeptides is unlike that of naturally occurring antibodies, fragments thereof, e.g., Fv, or known synthetic
polypeptides or "chimeric antibodies" in that the regions of the BABS responsible for specificity and affinity of binding
(analogous to native antibody variable regions) are linked by peptide bonds, expressed from a single DNA, and may
themselves be chimeric, e.g., may comprise amino acid sequences homologous to portions of at least two different
antibody molecules. The BABS are biosynthetic in the sense that they are synthesized in a cellular host made to
express a synthetic DNA, that is, a recombinant DNA made by ligation of plural, chemically synthesized
oligonucleotides, or by ligation of fragments of DNA derived from the genome of a hybridoma, mature B cell clone, or a cDNA
library derived from such natural sources. The single polypeptide chain and linker-containing multifunctional proteins
of the invention are properly characterized as "binding sites" in that these synthetic molecules are designed to have
specific affinity for a preselected antigenic determinant. The polypeptides of the invention comprise structures which
can be patterned after regions of native antibodies known to be responsible for antigen recognition.

Brief Description of the Drawing 

[0017] The foregoing and other objects of this invention, the various features thereof, as well as the invention itself, 
may be more fully understood from the following description, when read together with the accompanying drawings. 

[0018] Figure 1A is a schematic representation of an intact IgG antibody molecule containing two light chains, each
consisting of one variable and one constant domain, and two heavy chains, each consisting of one variable and three
constant domains. Figure 1B is a schematic drawing of the structure of Fv proteins (and DNA encoding them) illustrating
V H and V L domains, each of which comprises four framework (FR) regions and three complementarity determining
(CDR) regions. Boundaries of CDRs are indicated, by way of example, for monoclonal 26-10, a well known and
characterized murine monoclonal specific for digoxin.

[0019] Figures 2A-2E are schematic representations of some classes of reagents, each of which comprises a
biosynthetic antibody binding site.

[0020] Figure 3 discloses five amino acid sequences (heavy chains) in single letter code lined up vertically to facilitate
understanding. Sequence 1 is the known native sequence of V H from murine monoclonal glp-4 (anti-lysozyme).
Sequence 2 is the known native sequence of V H from murine monoclonal 26-10 (anti-digoxin). Sequence 3 is a BABS
comprising the FRs from 26-10 V H and the CDRs from glp-4 V H. The CDRs are identified in lower case letters; restriction
sites in the DNA used to produce chimeric sequence 3 are also identified. Sequence 4 is the known native sequence
of V H from human myeloma antibody NEWM. Sequence 5 is a BABS comprising the FRs from NEWM V H and the
CDRs from glp-4 V H, i.e., illustrates a "humanized" binding site having a human framework but an affinity for lysozyme
similar to murine glp-4.

[0021] Figures 4A-4F are the synthetic nucleic acid sequences and encoded amino acid sequences of (4A) the heavy
chain variable domain of murine anti-digoxin monoclonal 26-10; (4B) the light chain variable domain of murine
anti-digoxin monoclonal 26-10; (4C) a heavy chain variable domain of a BABS comprising CDRs of glp-4 and FRs of 26-10;
(4D) a light chain variable region of the same BABS; (4E) a heavy chain variable region of a BABS comprising CDRs
of glp-4 and FRs of NEWM; and (4F) a light chain variable region comprising CDRs of glp-4 and FRs of NEWM.
Delineated are FRs, CDRs, and restriction sites for endonuclease digestion, most of which were introduced during
design of the DNA.

[0022] Figure 5 is the nucleic acid and encoded amino acid sequence of a host DNA (V H) designed to facilitate
insertion of CDRs of choice. The DNA was designed to have unique 6-base sites directly flanking the CDRs so that
relatively small oligonucleotides defining portions of CDRs can be readily inserted, and to have other sites to facilitate
manipulation of the DNA to optimize binding properties in a given construct. The framework regions of the molecule
correspond to murine FRs (Figure 4A).

[0023] Figures 6A and 6B are multifunctional proteins (and DNA encoding them) comprising a single chain BABS
with the specificity of murine monoclonal 26-10, linked through a spacer to the FB fragment of protein A, here fused
as a leader, and constituting a binding site for Fc. The spacer comprises the 11 C-terminal amino acids of the FB
followed by Asp-Pro (a dilute acid cleavage site). The single chain BABS comprises sequences mimicking the V H and
V L (6A) and the V L and V H (6B) of murine monoclonal 26-10. The V L in construct 6A is altered at residue 4 where valine
replaces methionine present in the parent 26-10 sequence. These constructs contain binding sites for both Fc and
digoxin. Their structure may be summarized as:

(6A) FB-Asp-Pro-V H-(Gly4-Ser)3-V L,

and

(6B) FB-Asp-Pro-V L-(Gly4-Ser)3-V H,

where (Gly4-Ser)3 is a polypeptide linker.
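
The domain order of the two constructs summarized above can be illustrated as simple string concatenation. This is a schematic sketch only: the tokens `<FB>`, `<VH>` and `<VL>` are hypothetical placeholders standing in for the actual domain sequences (which are given in Figures 6A and 6B and are not reproduced here), and the helper name `assemble` is an assumption made for illustration.

```python
# Schematic sketch; placeholder tokens stand in for the real domain sequences.
LINKER = "GGGGS" * 3   # (Gly4-Ser)3 in one-letter code
ACID_SITE = "DP"       # Asp-Pro, the dilute acid cleavage site


def assemble(leader: str, first: str, second: str) -> str:
    """Join leader, Asp-Pro site, first domain, linker, and second domain."""
    return leader + ACID_SITE + first + LINKER + second


FB, VH, VL = "<FB>", "<VH>", "<VL>"  # hypothetical placeholders, not sequences

construct_6a = assemble(FB, VH, VL)  # FB-Asp-Pro-VH-(Gly4-Ser)3-VL
construct_6b = assemble(FB, VL, VH)  # FB-Asp-Pro-VL-(Gly4-Ser)3-VH
print(construct_6a)
print(construct_6b)
```

The only difference between the two constructs is the order of the variable domains on either side of the linker.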

[0024] In Figures 4A-4E and 6A and 6B, the amino acid sequence of the expression products starts after the GAATTC
sequence, which codes for an EcoRI splice site, translated as Glu-Phe on the drawings.
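
The translation of the EcoRI site mentioned here follows directly from the standard genetic code, in which GAA encodes glutamic acid and TTC encodes phenylalanine. The short sketch below shows this; the function name `translate` and the two-entry codon table are illustrative choices, and only the two codons needed for this site are tabulated.

```python
# Only the two codons needed here are tabulated (standard genetic code).
CODONS = {"GAA": "Glu", "TTC": "Phe"}

ECORI_SITE = "GAATTC"


def translate(dna: str) -> list:
    """Translate an in-frame DNA string three bases at a time."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


print("-".join(translate(ECORI_SITE)))
```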

[0025] Figure 7A is a graph of percent of maximum counts bound of radioiodinated digoxin versus concentration of
binding protein adsorbed to the plate comparing the binding of native 26-10 (curve 1) and the construct of Figure 6A
and Figure 2B renatured using two different procedures (curves 2 and 3). Figure 7B is a graph demonstrating the
bifunctionality of the FB-(26-10) BABS adhered to microtiter plates through the specific binding of the binding site to
the digoxin-BSA coat on the plate. Figure 7B shows the percent inhibition of 125I-rabbit-IgG binding to the FB domain
of the FB BABS by the addition of IgG, protein A, FB, murine IgG2a, and murine IgG1.
[0026] Figure 8 is a schematic representation of a model assembled DNA sequence encoding a multifunctional
biosynthetic protein comprising a leader peptide (used to aid expression and thereafter cleaved), a binding site, a
spacer, and an effector molecule attached as a trailer sequence.

[0027] Figures 9A-9E are exemplary synthetic nucleic acid sequences and corresponding encoded amino acid
sequences of binding sites of different specificities: (A) FRs from NEWM and CDRs from 26-10 having the digoxin
specificity of murine monoclonal 26-10; (B) FRs from 26-10, and CDRs from G-loop-4 (glp-4) having lysozyme specificity;
(C) FRs and CDRs from MOPC-315 having dinitrophenol (DNP) specificity; (D) FRs and CDRs from an anti-CEA
monoclonal antibody; (E) FRs in both V H and V L and CDR1 and CDR3 in V H, and CDR1, CDR2, and CDR3 in V L from
an anti-CEA monoclonal antibody; CDR2 in V H is a CDR2 consensus sequence found in most immunoglobulin V H
regions.

[0028] Figure 10A is a schematic representation of the DNA and amino acid sequence of a leader peptide (MLE)
protein with corresponding DNA sequence and some major restriction sites. Figure 10B shows the design of an
expression plasmid used to express MLE-BABS (26-10). During construction of the gene, fusion partners were joined at
the EcoRI site that is shown as part of the leader sequence. The pBR322 plasmid, opened at the unique SspI and PstI
sites, was combined in a 3-part ligation with an SspI to EcoRI fragment bearing the tno promoter and MLE leader and
with an EcoRI to PstI fragment carrying the BABS gene. The resulting expression vector confers tetracycline resistance
on positive transformants.

[0029] Figure 11 is an SDS-polyacrylamide gel (15%) of the (26-10) BABS at progressive stages of purification. Lane
0 shows low molecular weight standards; lane 1 is the MLE-BABS fusion protein; lane 2 is an acid digest of this material;
lane 3 is the pooled DE-52 chromatographed protein; lanes 4 and 5 are the same ouabain-Sepharose pool of single
chain BABS except that lane 4 protein is reduced and lane 5 protein is unreduced.

[0030] Figure 12 shows inhibition curves for 26-10 BABS and 26-10 Fab species, and indicates the relative affinities
of the antibody fragment for the indicated cardiac glycosides.

[0031] Figures 13A and 13B are plots of digoxin binding curves. (A) shows 26-10 BABS binding isotherm and Sips 
plot (inset), and (B) shows 26-10 Fab binding isotherm and Sips plot (inset). 
[0032] Figure 14 is a nucleic acid sequence and corresponding amino acid sequence of a modified FB dimer leader
sequence and various restriction sites.

[0033] Figures 15A-15H are nucleic acid sequences and corresponding amino acid sequences of biosynthetic
multifunctional proteins including a single chain BABS and various biologically active protein trailers linked via a spacer
sequence. Also indicated are various endonuclease digestion sites. The trailing sequences are (A) epidermal growth
factor (EGF); (B) streptavidin; (C) tumor necrosis factor (TNF); (D) calmodulin; (E) platelet derived growth factor-beta
(PDGF-beta); (F) ricin; (G) interleukin-2; and (H) an FB-FB dimer.

Description 

[0034] The invention will first be described in its broadest overall aspects with a more detailed description following.
[0035] A class of novel biosynthetic, bi- or multifunctional proteins has now been designed and engineered which
comprise biosynthetic antibody binding sites, that is, "BABS" or biosynthetic polypeptides defining structure capable
of selective antigen recognition and preferential antigen binding, and one or more peptide-bonded additional protein
or polypeptide regions designed to have a preselected property. Examples of the second region include amino acid
sequences designed to sequester ions, which makes the protein suitable for use as an imaging agent, and sequences
designed to facilitate immobilization of the protein for use in affinity chromatography and solid phase immunoassay.
Another example of the second region is a bioactive effector molecule, that is, a protein having a conformation suitable
for biological activity, such as an enzyme, toxin, receptor, binding site, growth factor, cell differentiation factor,
lymphokine, cytokine, hormone, or anti-metabolite. This invention features synthetic, multifunctional proteins comprising
these regions peptide bonded to one or more biosynthetic antibody binding sites, synthetic, single chain proteins
designed to bind preselected antigenic determinants with high affinity and specificity, constructs containing multiple
binding sites linked together to provide multipoint antigen binding and high net affinity and specificity, DNA encoding
these proteins prepared by recombinant techniques, host cells harboring these DNAs, and methods for the production
of these proteins and DNAs.

[0036] It has also been discovered that biosynthetic domains mimicking the structure of the two chains of an immu- 
noglobulin binding site may be connected by a polypeptide linker while closely approaching, retaining, and often im- 
proving their collective binding properties. 

[0037] The binding site region of the multifunctional proteins comprises at least one, and preferably two domains,
each of which has an amino acid sequence homologous to portions of the CDRs of the variable domain of an
immunoglobulin light or heavy chain, and other sequence homologous to the FRs of the variable domain of the same, or a
second, different immunoglobulin light or heavy chain. The two domain binding site construct also includes a polypeptide
linking the domains. Polypeptides so constructed bind a specific preselected antigen determined by the CDRs held in
proper conformation by the FRs and the linker. Preferred structures have human FRs, i.e., mimic the amino acid
sequence of at least a portion of the framework regions of a human immunoglobulin, and have linked domains which
together comprise structure mimicking a V H-V L or V L-V H immunoglobulin two-chain binding site. CDR regions of a
mammalian immunoglobulin, such as those of mouse, rat, or human origin are preferred. The biosynthetic antibody
binding site comprises FRs homologous with a portion of the FRs of a human immunoglobulin and CDRs homologous
with CDRs from a mouse or rat immunoglobulin. This type of chimeric polypeptide displays the antigen binding
specificity of the mouse or rat immunoglobulin, while its human framework minimizes human immune reactions. In addition,
the chimeric polypeptide may comprise other amino acid sequences. It may comprise, for example, a sequence
homologous to a portion of the constant domain of an immunoglobulin, but preferably is free of constant regions (other
than FRs).

[0038] The binding site region(s) of the chimeric proteins are thus single chain composite polypeptides comprising
a structure which in solution behaves like an antibody binding site. The two domain, single chain composite polypeptide
has a structure patterned after tandem V H and V L domains, but with the carboxyl terminal of one attached through a
linking amino acid sequence to the amino terminal of the other. The linking amino acid sequence may or may not itself
be antigenic or biologically active. It preferably spans a distance of at least about 4 nm (40 Å), i.e., comprises at least
about 14 amino acids, and comprises residues which together present a hydrophilic, relatively unstructured region.
Linking amino acid sequences having little or no secondary structure work well. Optionally, one or a pair of unique
amino acids or amino acid sequences recognizable by a site specific cleavage agent may be included in the linker.
This permits the V H and V L-like domains to be separated after expression, or the linker to be excised after refolding of
the binding site.
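
The two figures given for the linker in this paragraph, a span of about 4 nm and a length of about 14 residues, together imply an average span per residue. The back-of-envelope sketch below works through that arithmetic and applies it to a 15-residue linker such as (Gly4-Ser)3. The uniform per-residue spacing is an assumption made purely for illustration, derived from the text's own two numbers rather than from any measurement, and the name `estimated_span` is hypothetical.

```python
SPAN_ANGSTROM = 40.0  # about 4 nm, from the text
MIN_RESIDUES = 14     # from the text

# Average span per residue implied by the two figures above (assumed uniform).
per_residue = SPAN_ANGSTROM / MIN_RESIDUES


def estimated_span(n_residues: int) -> float:
    """Estimate the span in Angstroms of an n-residue linker at that spacing."""
    return n_residues * per_residue


print(round(per_residue, 2))         # Angstroms per residue
print(round(estimated_span(15), 1))  # 15-residue linker, e.g. (Gly4-Ser)3
```

Under this assumption a 15-residue linker slightly exceeds the stated 4 nm minimum, which is consistent with the use of (Gly4-Ser)3 as an exemplary linker.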

[0039] Either the amino or carboxyl terminal ends (or both ends) of these chimeric, single chain binding sites are
attached to an amino acid sequence which itself is bioactive or has some other function to produce a bifunctional or
multifunctional protein. For example, the synthetic binding site may include a leader and/or trailer sequence defining
a polypeptide having enzymatic activity, independent affinity for an antigen different from the antigen to which the
binding site is directed, or having other functions such as to provide a convenient site of attachment for a radioactive
ion, or to provide a residue designed to link chemically to a solid support. This fused, independently functional section
of protein should be distinguished from fused leaders used simply to enhance expression in prokaryotic host cells or
yeasts. The multifunctional proteins also should be distinguished from the "conjugates" disclosed in the prior art
comprising antibodies which, after expression, are linked chemically to a second moiety.

[0040] Often, a series of amino acids designed as a "spacer" is interposed between the active regions of the
multifunctional protein. Use of such a spacer can promote independent refolding of the regions of the protein. The spacer
also may include a specific sequence of amino acids recognized by an endopeptidase, for example, endogenous to a
target cell (e.g., one having a surface protein recognized by the binding site) so that the bioactive effector protein is
cleaved and released at the target. The second functional protein preferably is present as a trailer sequence, as trailers
exhibit less of a tendency to interfere with the binding behavior of the BABS.
[0041] The therapeutic use of such "self-targeted" bioactive proteins offers a number of advantages over conjugates
of immunoglobulin fragments or complete antibody molecules: they are stable, less immunogenic and have a lower
molecular weight; they can penetrate body tissues more rapidly for purposes of imaging or drug delivery because of
their smaller size; and they can facilitate accelerated clearance of targeted isotopes or drugs. Furthermore, because
design of such structures at the DNA level as disclosed herein permits ready selection of bioproperties and specificities,
an essentially limitless combination of binding sites and bioactive proteins is possible, each of which can be refined
as disclosed herein to optimize independent activity at each region of the synthetic protein. The synthetic proteins can
be expressed in procaryotes such as E. coli, and thus are less costly to produce than immunoglobulins or fragments
thereof which require expression in cultured animal cell lines.

[0042] The invention thus provides a family of recombinant proteins expressed from a single piece of DNA, all of
which have the capacity to bind specifically with a predetermined antigenic determinant. The preferred species of the
proteins comprise a second domain which functions independently of the binding region. In this aspect the invention
provides an array of "self-targeted" proteins which have a bioactive function and which deliver that function to a locus
determined by the binding site's specificity. It also provides biosynthetic binding proteins having attached polypeptides
suitable for attachment to immobilization matrices which may be used in affinity chromatography and solid phase
immunoassay applications, or suitable for attachment to ions, e.g., radioactive ions, which may be used for in vivo
imaging.

[0043] The successful design and manufacture of the proteins of the invention depends on the ability to produce
biosynthetic binding sites, and most preferably, sites comprising two domains mimicking the variable domains of
immunoglobulin connected by a linker.

[0044] As is now well known, Fv, the minimum antibody fragment which contains a complete antigen recognition and
binding site, consists of a dimer of one heavy and one light chain variable domain in noncovalent association (Figure
1A). It is in this configuration that the three complementarity determining regions of each variable domain interact to
define an antigen binding site on the surface of the V H-V L dimer. Collectively, the six complementarity determining
regions (see Figure 1B) confer antigen binding specificity to the antibody. FRs flanking the CDRs have a tertiary
structure which is essentially conserved in native immunoglobulins of species as diverse as human and mouse. These FRs
serve to hold the CDRs in their appropriate orientation. The constant domains are not required for binding function,
but may aid in stabilizing V H-V L interaction. Even a single variable domain (or half of an Fv comprising only three CDRs
specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than an entire binding
site (Painter et al. (1972) Biochem. 11:1327-1337).

[0045] This knowledge of the structure of immunoglobulin proteins has now been exploited to develop multifunctional
fusion proteins comprising biosynthetic antibody binding sites and one or more other domains.

[0046] The structure of these biosynthetic proteins in the region which imparts the binding properties to the protein is
analogous to the Fv region of a natural antibody. It comprises at least one, and preferably two domains consisting of
amino acids defining V H and V L-like polypeptide segments connected by a linker which together form the tertiary
molecular structure responsible for affinity and specificity. Each domain comprises a set of amino acid sequences
analogous to immunoglobulin CDRs held in appropriate conformation by a set of sequences analogous to the framework
regions (FRs) of an Fv fragment of a natural antibody.

[0047] The term CDR, as used herein, refers to amino acid sequences which together define the binding affinity and
specificity of the natural Fv region of a native immunoglobulin binding site, or a synthetic polypeptide which mimics
this function. CDRs typically are not wholly homologous to hypervariable regions of natural Fvs, but rather also may
include specific amino acids or amino acid sequences which flank the hypervariable region and have heretofore been
considered framework not directly determinative of complementarity. The term FR, as used herein, refers to amino acid
sequences flanking or interposed between CDRs.

[0048] The CDR and FR polypeptide segments are designed based on sequence analysis of the Fv region of
preexisting antibodies or of the DNA encoding them. In one embodiment, the amino acid sequences constituting the FR
regions of the BABS are analogous to the FR sequences of a first preexisting antibody, for example, a human IgG.
The amino acid sequences constituting the CDR regions are analogous to the sequences from a second, different
preexisting antibody, for example, the CDRs of a murine IgG. Alternatively, the CDRs and FRs from a single preexisting
antibody from, e.g., an unstable or hard to culture hybridoma, may be copied in their entirety.

[0049] The design and biosynthesis of various reagents is enabled, all of which are characterized by a region having
affinity for a preselected antigenic determinant. The binding site and other regions of the biosynthetic protein are
designed with the particular planned utility of the protein in mind. Thus, if the reagent is designed for intravascular use
in mammals, the FR regions may comprise amino acids similar or identical to at least a portion of the framework region
amino acids of antibodies native to that mammalian species. On the other hand, the amino acids comprising the CDRs
may be analogous to a portion of the amino acids from the hypervariable region (and certain flanking amino acids) of
an antibody having a known affinity and specificity, e.g., a murine or rat monoclonal antibody.

[0050] Other sections of native immunoglobulin protein structure, e.g., C H and C L , need not be present and normally
are intentionally omitted from the biosynthetic proteins. However, the proteins of the invention normally comprise
additional polypeptide or protein regions defining a bioactive region, e.g., a toxin or enzyme, or a site onto which a toxin
or a remotely detectable substance can be attached.

[0051] The invention thus can provide intact biosynthetic antibody binding sites analogous to V H -V L dimers, either
non-covalently associated, disulfide bonded, or preferably linked by a polypeptide sequence to form a composite
V H -V L or V L -V H polypeptide which may be essentially free of antibody constant region. The invention also provides proteins
analogous to an independent V H or V L domain, or dimers thereof. Any of these proteins may be provided in a form
linked to, for example, amino acids analogous or homologous to a bioactive molecule such as a hormone or toxin.

[0052] Connecting the independently functional regions of the protein is a spacer comprising a short amino acid
sequence whose function is to separate the functional regions so that they can independently assume their active
tertiary conformation. The spacer can consist of an amino acid sequence present on the end of a functional protein
which sequence is not itself required for its function, and/or specific sequences engineered into the protein at the DNA
level.

[0053] The spacer generally may comprise between 5 and 25 residues. Its optimal length may be determined using
constructs of different spacer lengths varying, for example, by units of 5 amino acids. The specific amino acids in the
spacer can vary. Cysteines should be avoided. Hydrophilic amino acids are preferred. The spacer sequence may mimic
the sequence of a hinge region of an immunoglobulin. It may also be designed to assume a structure, such as a helical
structure. Proteolytic cleavage sites may be designed into the spacer separating the variable region-like sequences
from other pendant sequences so as to facilitate cleavage of intact BABS, free of other protein, or so as to release the
bioactive protein in vivo.

[0054] Figures 2A-2E illustrate five examples of protein structures embodying the single polypeptide chain and linkers
of the invention that can be produced by following the teaching disclosed herein. All of these examples are characterized
by a biosynthetic polypeptide defining a binding site 3, comprising amino acid sequences comprising CDRs and FRs,
often derived from different immunoglobulins, or sequences homologous to a portion of CDRs and FRs from different
immunoglobulins. Figure 2A depicts a single chain construct comprising a polypeptide domain 10 having an amino
acid sequence analogous to the variable region of an immunoglobulin heavy chain, bound through its carboxyl end to
a polypeptide linker 12, which in turn is bound to a polypeptide domain 14 having an amino acid sequence analogous
to the variable region of an immunoglobulin light chain. Of course, the light and heavy chain domains may be in reverse
order. Alternatively, the binding site may comprise two substantially homologous amino acid sequences which are both
analogous to the variable region of an immunoglobulin heavy or light chain.

[0055] The linker 12 should be long enough (e.g., about 15 amino acids or about 4 nm (40 Å)) to permit the chains
10 and 14 to assume their proper conformation. The linker 12 may comprise an amino acid sequence homologous to
a sequence identified as "self" by the species into which it will be introduced, if drug use is intended. For example, the
linker may comprise an amino acid sequence patterned after a hinge region of an immunoglobulin. The linker preferably
comprises hydrophilic amino acid sequences. It may also comprise a bioactive polypeptide such as a cell toxin which
is to be targeted by the binding site, or a segment easily labelled by a radioactive reagent which is to be delivered,
e.g., to the site of a tumor comprising an epitope recognized by the binding site. The linker may also include one or two
built-in cleavage sites, i.e., an amino acid or amino acid sequence susceptible to attack by a site specific cleavage
agent as described below. This strategy permits the V H and V L -like domains to be separated after expression, or the
linker to be excised after folding while retaining the binding site structure in non-covalent association. The amino acids
of the linker preferably are selected from among those having relatively small, unreactive side chains. Alanine, serine,
and glycine are preferred.

[0056] Generally, the design of the linker involves considerations similar to the design of the spacer, excepting that
binding properties of the linked domains are seriously degraded if the linker sequence is shorter than about 2 nm (20 Å)
in length, i.e., comprises fewer than about 10 residues. Linkers longer than the approximate 4 nm (40 Å) distance between
the N-terminus of a native variable region and the C-terminus of its sister chain may be used, but also potentially can
diminish the BABS binding properties. Linkers comprising between 12 and 18 residues are preferred. The preferred
length in specific constructs may be determined by varying linker length first by units of 5 residues, and second by
units of 1-4 residues after determining the best multiple of the pentameric starting units.
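
The two-stage length screen described above can be sketched as follows. This is an illustrative sketch only: the function names, the repeating unit GGGGS, and the default ranges are assumptions chosen for the example. The text itself states only that alanine, serine, and glycine are preferred, that cysteine should be avoided, and that 12 to 18 residues are preferred.

```python
# Illustrative sketch; names and the "GGGGS" unit are hypothetical.
PREFERRED_RESIDUES = set("AGS")  # alanine, glycine, serine (per the text)


def coarse_lengths(min_len=5, max_len=25, step=5):
    """First pass: candidate lengths in units of `step` residues."""
    return list(range(min_len, max_len + 1, step))


def fine_lengths(best_multiple, max_offset=4):
    """Second pass: lengths 1..max_offset residues either side of the best coarse length."""
    lo = max(1, best_multiple - max_offset)
    return [n for n in range(lo, best_multiple + max_offset + 1) if n != best_multiple]


def make_linker(length, unit="GGGGS"):
    """Build a linker of `length` residues by repeating an assumed unit."""
    repeats = -(-length // len(unit))  # ceiling division
    return (unit * repeats)[:length]


def check_linker(seq, preferred_range=(12, 18)):
    """Report the simple design criteria named in the text for one candidate."""
    return {
        "length": len(seq),
        "in_preferred_range": preferred_range[0] <= len(seq) <= preferred_range[1],
        "only_small_residues": set(seq) <= PREFERRED_RESIDUES,
        "has_cysteine": "C" in seq,
    }


if __name__ == "__main__":
    for n in coarse_lengths():
        print(n, make_linker(n), check_linker(make_linker(n)))
```

Which candidate actually performs best can only be settled by expressing and testing the constructs; the sketch merely enumerates the candidates to build.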

[0057] Additional proteins or polypeptides may be attached to either or both the amino or carboxyl termini of the
binding site to produce multifunctional proteins of the type illustrated in Figures 2B-2E. As an example, in Figure 2B,
a helically coiled polypeptide structure 16 comprises a protein A fragment (FB) linked to the amino terminal end of a
V H -like domain 10 via a spacer 18. Figure 2C illustrates a bifunctional protein having an effector polypeptide 20 linked
via spacer 22 to the carboxyl terminus of polypeptide 14 of binding protein segment 2. This effector polypeptide 20
may consist of, for example, a toxin, therapeutic drug, binding protein, enzyme or enzyme fragment, site of attachment
for an imaging agent (e.g., to chelate a radioactive ion such as indium), or site of selective attachment to an
immobilization matrix so that the BABS can be used in affinity chromatography or solid phase binding assay. This effector
alternatively may be linked to the amino terminus of polypeptide 10, although trailers are preferred. Figure 2D depicts
a trifunctional protein comprising a linked pair of BABS 2 having another distinct protein domain 20 attached to the
N-terminus of the first binding protein segment. Use of multiple BABS in a single protein enables production of constructs
having very high selective affinity for multiepitopic sites such as cell surface proteins.

[0058] The independently functional domains are attached by a spacer 18 (Figs. 2B and 2D) covalently linking the
C-terminus of the protein 16 or 20 to the N-terminus of the first domain 10 of the binding protein segment 2, or by a spacer
22 linking the C-terminus of the second binding domain 14 to the N-terminus of another protein (Figs. 2C and 2D). The
spacer may be an amino acid sequence analogous to linker sequence 12, or it may take other forms. As noted above,
the spacer's primary function is to separate the active protein regions to promote their independent bioactivity and
permit each region to assume its bioactive conformation independent of interference from its neighboring structure.

[0059] Figure 2E depicts another type of reagent, comprising a BABS having only one set of three CDRs, e.g.,
analogous to a heavy chain variable region, which retains a measure of affinity for the antigen. Attached to the carboxyl
end of the polypeptide 10 or 14 comprising the FR and CDR sequences constituting the binding site 3 through spacer
22 is effector polypeptide 20 as described above.

[0060] As is evidenced from the foregoing, the invention provides a large family of reagents comprising proteins, at
least a portion of which defines a binding site patterned after the variable region of an immunoglobulin. It will be apparent
that the nature of any protein fragments linked to the BABS, and used for reagents embodying the invention, is
essentially unlimited, the essence of the invention being the provision, either alone or linked to other proteins, of binding
sites having specificities to any antigen desired.

[0061] The clinical administration of multifunctional proteins comprising a BABS, or a BABS alone, affords a number
of advantages over the use of intact natural or chimeric antibody molecules, fragments thereof, and conjugates
comprising such antibodies linked chemically to a second bioactive moiety. The multifunctional proteins described herein
offer fewer cleavage sites to circulating proteolytic enzymes, their functional domains are connected by peptide bonds
to polypeptide linker or spacer sequences, and thus the proteins have improved stability. Because of their smaller size
and efficient design, the multifunctional proteins described herein reach their target tissue more rapidly, and are cleared
more quickly from the body. They also have reduced immunogenicity. In addition, their design facilitates coupling to
other moieties in drug targeting and imaging applications. Such coupling may be conducted chemically after expression
of the BABS to a site of attachment for the coupling product engineered into the protein at the DNA level. Active effector
proteins having toxic, enzymatic, binding, modulating, cell differentiating, hormonal, or other bioactivity are expressed
from a single DNA as a leader and/or trailer sequence, peptide bonded to the BABS.

Design and Manufacture 

[0062] The single polypeptide chain and linkers of the invention are designed at the DNA level. The chimeric or
synthetic DNAs are then expressed in a suitable host system, and the expressed proteins are collected and renatured
if necessary. A preferred general structure of the DNA encoding the proteins is set forth in Figure 8. As illustrated, it
encodes an optimal leader sequence used to promote expression in procaryotes having a built-in cleavage site
recognizable by a site specific cleavage agent, for example, an endopeptidase, used to remove the leader after expression.
This is followed by DNA encoding a V H -like domain, comprising CDRs and FRs, a linker, a V L -like domain, again
comprising CDRs and FRs, a spacer, and an effector protein. After expression, folding, and cleavage of the leader, a
bifunctional protein is produced having a binding region whose specificity is determined by the CDRs, and a
peptide-linked independently functional effector region.

[0063] The ability to design the BABS of the invention depends on the ability to determine the sequence of the amino
acids in the variable region of monoclonal antibodies of interest, or the DNA encoding them. Hybridoma technology
enables production of cell lines secreting antibody to essentially any desired substance that produces an immune
response. RNA encoding the light and heavy chains of the immunoglobulin can then be obtained from the cytoplasm
of the hybridoma. The 5' end portion of the mRNA can be used to prepare cDNA for subsequent sequencing, or the
amino acid sequence of the hypervariable and flanking framework regions can be determined by amino acid sequencing
of the V region fragments of the H and L chains. Such sequence analysis is now conducted routinely. This knowledge,
coupled with observations and deductions of the generalized structure of immunoglobulin Fvs, permits one to design
synthetic genes encoding FR and CDR sequences which likely will bind the antigen. These synthetic genes are then
prepared using known techniques, or using the technique disclosed below, inserted into a suitable host, and expressed,
and the expressed protein is purified. Depending on the host cell, renaturation techniques may be required to attain
proper conformation. The various proteins are then tested for binding ability, and one having appropriate affinity is
selected for incorporation into a reagent of the type described above. If necessary, point substitutions seeking to
optimize binding may be made in the DNA using conventional cassette mutagenesis or other protein engineering
methodology such as is disclosed below.

[0064] Preparation of the proteins of the invention also is dependent on knowledge of the amino acid sequence (or
corresponding DNA or RNA sequence) of bioactive proteins such as enzymes, toxins, growth factors, cell differentiation
factors, receptors, anti-metabolites, hormones or various cytokines or lymphokines. Such sequences are reported in
the literature and available through computerized data banks.

[0065] The DNA sequences of the binding site and the second protein domain are fused using conventional
techniques, or assembled from synthesized oligonucleotides, and then expressed using equally conventional techniques.

[0066] The processes for manipulating, amplifying, and recombining DNA which encodes amino acid sequences of
interest are generally well known in the art, and therefore, not described in detail herein. Methods of identifying and
isolating genes encoding antibodies of interest are well understood, and described in the patent and other literature.
In general, the methods involve selecting genetic material coding for amino acids which define the proteins of interest,
including the CDRs and FRs of interest, according to the genetic code.

[0067] Accordingly, the construction of DNAs encoding proteins as disclosed herein can be done using known
techniques involving the use of various restriction enzymes which make sequence specific cuts in DNA to produce blunt
ends or cohesive ends, DNA ligases, techniques enabling enzymatic addition of sticky ends to blunt-ended DNA,
construction of synthetic DNAs by assembly of short or medium length oligonucleotides, cDNA synthesis techniques,
and synthetic probes for isolating immunoglobulin or other bioactive protein genes. Various promoter sequences and
other regulatory DNA sequences used in achieving expression, and various types of host cells are also known and
available. Conventional transfection techniques, and equally conventional techniques for cloning and subcloning DNA
are useful in the practice of this invention and known to those skilled in the art. Various types of vectors may be used
such as plasmids and viruses including animal viruses and bacteriophages. The vectors may exploit various marker
genes which impart to a successfully transfected cell a detectable phenotypic property that can be used to identify
which of a family of clones has successfully incorporated the recombinant DNA of the vector.

[0068] One method for obtaining DNA encoding the proteins disclosed herein is by assembly of synthetic
oligonucleotides produced in a conventional, automated, polynucleotide synthesizer followed by ligation with appropriate
ligases. For example, overlapping, complementary DNA fragments comprising 15 bases may be synthesized semi-manually
using phosphoramidite chemistry, with end segments left unphosphorylated to prevent polymerization during ligation.
One end of the synthetic DNA is left with a "sticky end" corresponding to the site of action of a particular restriction
endonuclease, and the other end is left with an end corresponding to the site of action of another restriction
endonuclease. Alternatively, this approach can be fully automated. The DNA encoding the protein may be created by
synthesizing longer single strand fragments (e.g., 50-100 nucleotides long) in, for example, a Biosearch oligonucleotide
synthesizer, and then ligating the fragments.

[0069] A method of producing BABS is to produce a synthetic DNA encoding a polypeptide comprising, e.g., human
FRs, and intervening "dummy" CDRs, or amino acids having no function except to define suitably situated unique
restriction sites. This synthetic DNA is then altered by DNA replacement, in which restriction and ligation are employed
to insert synthetic oligonucleotides encoding CDRs defining a desired binding specificity in the proper location between
the FRs. This approach facilitates empirical refinement of the binding properties of the BABS.

[0070] This technique is dependent upon the ability to cleave a DNA corresponding in structure to a variable domain
gene at specific sites flanking nucleotide sequences encoding CDRs. These restriction sites in some cases may be
found in the native gene. Alternatively, non-native restriction sites may be engineered into the nucleotide sequence
resulting in a synthetic gene with a different sequence of nucleotides than the native gene, but encoding the same
variable region amino acids because of the degeneracy of the genetic code. The fragments resulting from endonuclease
digestion, and comprising FR-encoding sequences, are then ligated to non-native CDR-encoding sequences to
produce a synthetic variable domain gene with altered antigen binding specificity. Additional nucleotide sequences
encoding, for example, constant region amino acids or a bioactive molecule may then be linked to the gene sequences
to produce a bifunctional protein.
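
The replacement of a "dummy" or native CDR-encoding segment between two unique flanking restriction sites can be pictured with a short string-level sketch. It is a deliberate simplification and not the laboratory procedure: it treats each recognition sequence as an intact boundary and ignores the actual cut positions, overhangs, and reading frame. The function name and the toy gene are hypothetical; GGTACC and TCTAGA are the recognition sequences of KpnI and XbaI.

```python
def replace_cassette(gene, left_site, right_site, insert):
    """Swap the DNA lying between two unique recognition sites for `insert`.

    Simplified model: both recognition sequences are kept intact and only the
    segment between them is exchanged.
    """
    for site in (left_site, right_site):
        if gene.count(site) != 1:
            raise ValueError(f"site {site} must occur exactly once in the gene")
    start = gene.index(left_site) + len(left_site)
    end = gene.index(right_site)
    if end < start:
        raise ValueError("left site must precede right site")
    return gene[:start] + insert + gene[end:]


if __name__ == "__main__":
    # Toy gene: flank + KpnI site + dummy segment + XbaI site + flank (all hypothetical).
    toy_gene = "AAA" + "GGTACC" + "ACGACG" + "TCTAGA" + "TTT"
    print(replace_cassette(toy_gene, "GGTACC", "TCTAGA", "GCTGCT"))
```

The uniqueness check mirrors the requirement stated in the text that the flanking sites be unique within the gene, so that digestion releases only the intended segment.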

[0071] The expression of these synthetic DNAs can be achieved in both prokaryotic and eucaryotic systems via
transfection with an appropriate vector. In E. coli and other microbial hosts, the synthetic genes can be expressed as
a fusion protein which is subsequently cleaved. Expression in eucaryotes can be accomplished by the transfection of
DNA sequences encoding CDR and FR region amino acids and the amino acids defining a second function into a
myeloma or other type of cell line. By this strategy intact hybrid antibody molecules having hybrid Fv regions and
various bioactive proteins including a biosynthetic binding site may be produced. For fusion proteins expressed in
bacteria, subsequent proteolytic cleavage of the isolated fusions can be performed to yield free BABS, which can be
renatured to obtain an intact biosynthetic, hybrid antibody binding site.

[0072] Heretofore, it has not been possible to cleave the heavy and light chain region to separate the variable and
constant regions of an immunoglobulin so as to produce intact Fv, except in specific cases not of commercial utility.
However, one method of producing BABS is to redesign DNAs encoding the heavy and light chains of an
immunoglobulin, optionally altering its specificity or humanizing its FRs, and incorporating a cleavage site and "hinge region"
between the variable and constant regions of both the heavy and light chains. Such chimeric antibodies can be produced
in transfectomas or the like and subsequently cleaved using a preselected endopeptidase.

[0073] The hinge region is a sequence of amino acids which serves to promote efficient cleavage by a preselected
cleavage agent at a preselected, built-in cleavage site. It is designed to promote cleavage preferentially at the cleavage
site when the polypeptide is treated with the cleavage agent in an appropriate environment.

[0074] The hinge region can take many different forms. Its design involves selection of amino acid residues (and a
DNA fragment encoding them) which impart to the region of the fused protein about the cleavage site an appropriate
polarity, charge distribution, and stereochemistry which, in the aqueous environment where the cleavage takes place,
efficiently expose the cleavage site to the cleavage agent in preference to other potential cleavage sites that may be
present in the polypeptide, and/or improve the kinetics of the cleavage reaction. In specific cases, the amino acids
of the hinge are selected and assembled in sequence based on their known properties, and then the fused polypeptide
sequence is expressed, tested, and altered for refinement.

[0075] The hinge region is free of cysteine. This enables the cleavage reaction to be conducted under conditions in
which the protein assumes its tertiary conformation, and may be held in this conformation by intramolecular disulfide
bonds. It has been discovered that in these conditions access of the protease to potential cleavage sites which may
be present within the target protein is hindered. The hinge region may comprise an amino acid sequence which includes
one or more proline residues. This allows formation of a substantially unfolded molecular segment. Aspartic acid,
glutamic acid, arginine, lysine, serine, and threonine residues maximize ionic interactions and may be present in
amounts and/or in sequence which renders the moiety comprising the hinge water soluble.

[0076] The cleavage site preferably is immediately adjacent the Fv polypeptide chains and comprises one amino
acid or a sequence of amino acids exclusive of any sequence found in the amino acid structure of the chains in the
Fv. The cleavage site preferably is designed for unique or preferential cleavage by a specific selected agent.
Endopeptidases are preferred, although non-enzymatic (chemical) cleavage agents may be used. Many useful cleavage agents,
for instance, cyanogen bromide, dilute acid, trypsin, Staphylococcus aureus V-8 protease, post proline cleaving
enzyme, blood coagulation Factor Xa, enterokinase, and renin, recognize and preferentially or exclusively cleave
particular cleavage sites. One currently preferred cleavage agent is V-8 protease. The currently preferred cleavage site is
a Glu residue. Other useful enzymes recognize multiple residues as a cleavage site, e.g., factor Xa (Ile-Glu-Gly-Arg)
or enterokinase (Asp-Asp-Asp-Asp-Lys). The principles of this selective cleavage approach may also be used in the
design of the linker and spacer sequences of the multifunctional constructs of the invention where an excisable linker
or selectively cleavable linker or spacer is desired.
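
The requirement that the cleavage site be exclusive of any sequence found in the Fv chains can be checked with a simple motif scan. The following is a minimal sketch under stated assumptions: the recognition motifs are those named in the text (a single Glu for V-8 protease, Ile-Glu-Gly-Arg for factor Xa, Asp-Asp-Asp-Asp-Lys for enterokinase) written in single letter code, while the function names and the toy sequence are hypothetical. Real protease specificity depends on context and conformation, which a string search does not capture.

```python
# Recognition motifs named in the text, in single letter code.
RECOGNITION = {
    "V-8 protease": "E",
    "factor Xa": "IEGR",
    "enterokinase": "DDDDK",
}


def internal_sites(sequence, motif):
    """Return every start index at which `motif` occurs in `sequence`."""
    positions, i = [], sequence.find(motif)
    while i != -1:
        positions.append(i)
        i = sequence.find(motif, i + 1)
    return positions


def usable_agents(sequence, recognition=RECOGNITION):
    """Agents whose motif does not occur anywhere inside the given chain."""
    return [name for name, motif in recognition.items()
            if not internal_sites(sequence, motif)]


if __name__ == "__main__":
    toy_chain = "QVQLIEGRSGAEV"  # hypothetical sequence, for illustration only
    for name, motif in RECOGNITION.items():
        print(name, internal_sites(toy_chain, motif))
    print("usable:", usable_agents(toy_chain))
```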

Design of Synthetic V H and V L Mimics 

[0077] FRs from the heavy and light chain murine anti-digoxin monoclonal 26-10 (Figures 4A and 4B) were encoded
on the same DNAs with CDRs from the murine anti-lysozyme monoclonal glp-4 heavy chain (Figure 3 sequence 1)
and light chain to produce V H (Figure 4C) and V L (Figure 4D) regions together defining a biosynthetic antibody binding
site which is specific for lysozyme. Murine CDRs from both the heavy and light chains of monoclonal glp-4 were encoded
on the same DNAs with FRs from the heavy and light chains of human myeloma antibody NEWM (Figures 4E and 4F).
The resulting interspecies chimeric antibody binding domain has reduced immunogenicity in humans because of its
human FRs, and specificity for lysozyme because of its murine CDRs.

[0078] A synthetic DNA was designed to facilitate CDR insertions into a human heavy chain FR and to facilitate
empirical refinement of the resulting chimeric amino acid sequence. This DNA is depicted in Figure 5.

[0079] A synthetic, bifunctional FB-binding site protein was also designed at the DNA level, expressed, purified,
renatured, and shown to bind specifically with a preselected antigen (digoxin) and Fc. The detailed primary structure
of this construct is shown in Figure 6; its tertiary structure is illustrated schematically in Figure 2B.

[0080] Details of these and other experiments, and additional design principles on which the invention is based, are
set forth below.

GENE DESIGN AND EXPRESSION 

[0081] Given known variable region DNA sequences, synthetic V L and V H genes may be designed which encode
native or near native FR and CDR amino acid sequences from an antibody molecule, each separated by unique
restriction sites located as close to FR-CDR and CDR-FR borders as possible. Alternatively, genes may be designed
which encode native FR sequences which are similar or identical to the FRs of an antibody molecule from a selected
species, each separated by "dummy" CDR sequences containing strategically located restriction sites. These DNAs
serve as starting materials for producing BABS, as the native or "dummy" CDR sequences may be excised and replaced
with sequences encoding the CDR amino acids defining a selected binding site. Alternatively, one may design and
directly synthesize native or near-native FR sequences from a first antibody molecule, and CDR sequences from a
second antibody molecule. Any one of the V H and V L sequences described above may be linked together either directly or via
an amino acid chain or linker connecting the C-terminus of one chain with the N-terminus of the other.

[0082] These genes, once synthesized, may be cloned with or without additional DNA sequences coding for, e.g.,
an antibody constant region, enzyme, or toxin, or a leader peptide which facilitates secretion or intracellular stability
of a fusion polypeptide. The genes then can be expressed directly in an appropriate host cell, or can be further
engineered before expression by the exchange of FR, CDR, or "dummy" CDR sequences with new sequences. This
manipulation is facilitated by the presence of the restriction sites which have been engineered into the gene at the
FR-CDR and CDR-FR borders.

[0083] Figure 3 illustrates the general approach to designing a chimeric V H ; further details of exemplary designs at
the DNA level are shown in Figures 4A-4F. Figure 3, lines 1 and 2, show the amino acid sequences of the heavy chain
variable region of the murine monoclonals glp-4 (anti-lysozyme) and 26-10 (anti-digoxin), including the four FR and
three CDR sequences of each. Line 3 shows the sequence of a chimeric V H which comprises 26-10 FRs and glp-4
CDRs. As illustrated, the hybrid protein of line 3 is identical to the native protein of line 2, except that 1) the sequence
TFTNYYIHWLK has replaced the sequence IFTDFYMNWVR, 2) EWIGWIYPGNGNTKYNENFKG has replaced
DYIGYISPYSGVTGYNQKFKG, 3) RYTHYYF has replaced GSSGNKWAM, and 4) A has replaced V as the sixth amino
acid beyond CDR-2. These changes have the effect of changing the specificity of the 26-10 V H to mimic the specificity
of glp-4. The Ala to Val single amino acid replacement within the relatively conserved framework region of 26-10 is an
example of the replacement of an amino acid outside the hypervariable region made for the purpose of altering
specificity by CDR replacement. Beneath sequence 3 of Figure 3, the restriction sites in the DNA encoding the chimeric
V H (see Figures 4A-4F) are shown which are disposed about the CDR-FR borders.
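
At the amino acid level, the segment replacements listed above amount to a set of exact substring substitutions. The sketch below illustrates this with the three replaced segments quoted in the text. The framework stretches are placeholders written as runs of X (the full 26-10 V H sequence of Figure 3 is not reproduced here), and the function name is hypothetical.

```python
def graft_segments(sequence, replacements):
    """Apply (old, new) segment replacements, requiring each old segment to occur exactly once."""
    result = sequence
    for old, new in replacements:
        if result.count(old) != 1:
            raise ValueError(f"segment {old} must occur exactly once")
        result = result.replace(old, new)
    return result


# Segments quoted in the text: (26-10 segment, glp-4 segment that replaces it).
REPLACEMENTS = [
    ("IFTDFYMNWVR", "TFTNYYIHWLK"),
    ("DYIGYISPYSGVTGYNQKFKG", "EWIGWIYPGNGNTKYNENFKG"),
    ("GSSGNKWAM", "RYTHYYF"),
]

if __name__ == "__main__":
    # Placeholder framework: runs of "X" stand in for the real FR residues.
    toy_vh = "XXXXX".join(["", "IFTDFYMNWVR", "DYIGYISPYSGVTGYNQKFKG", "GSSGNKWAM", ""])
    print(graft_segments(toy_vh, REPLACEMENTS))
```

The single framework substitution beyond CDR-2 is omitted from the sketch because its position is defined relative to the real framework sequence, which the placeholder does not contain.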

[0084] Lines 4 and 5 of Figure 3 represent another construct. Line 4 is the full length V H of the human antibody
NEWM. That human antibody may be made specific for lysozyme by CDR replacement as shown in line 5. Thus, for
example, the segment TFTNYYIHWLK from glp-4 replaces TFSNDYYTWVR of NEWM, and its other CDRs are
replaced as shown. This results in a V H comprising a human framework with murine sequences determining specificity.

[0085] By sequencing any antibody, or obtaining the sequence from the literature, in view of this disclosure one
skilled in the art can produce a BABS of any desired specificity comprising any desired framework region. Diagrams
such as Figure 3 comparing the amino acid sequences are valuable in suggesting which particular amino acids should
be replaced to determine the desired complementarity. Expressed sequences may be tested for binding and refined
by exchanging selected amino acids in relatively conserved regions, based on observation of trends in amino acid
sequence data and/or computer modeling techniques.

[0086] Significant flexibility in V H and V L design is possible because the amino acid sequences are determined at
the DNA level, and the manipulation of DNA can be accomplished easily.

[0087] For example, the DNA sequence for murine V H and V L 26-10 containing specific restriction sites flanking each
of the three CDRs was designed with the aid of a commercially available computer program which performs combined
reverse translation and restriction site searches ("RV.exe" by Compugene, Inc.). The known amino acid sequences for
V H and V L 26-10 polypeptides were entered, and all potential DNA sequences which encode those peptides and all
potential restriction sites were analyzed by the program. The program can, in addition, select DNA sequences encoding
the peptide using only codons preferred by E. coli if this bacterium is to be the host expression organism of choice.
Figures 4A and 4B show an example of program output. The nucleic acid sequences of the synthetic gene and the
corresponding amino acids are shown. Sites of restriction endonuclease cleavage are also indicated. The CDRs of these
synthetic genes are underlined.
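The combined reverse translation and restriction site search described in paragraph [0087] can be illustrated with a minimal sketch. This is not the Compugene program; the function names (`encodings`, `reachable_sites`) and the short test peptide are hypothetical, and only the standard genetic code and the well-known recognition sequences of KpnI, XbaI, and PstI are assumed. Codon preference for E. coli is deliberately left out, since no preference table is given in the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)

# Recognition sequences of three six-base enzymes (illustrative subset).
SITES = {"KpnI": "GGTACC", "XbaI": "TCTAGA", "PstI": "CTGCAG"}


def encodings(peptide):
    """Yield every DNA sequence that encodes the given one-letter peptide."""
    for codons in product(*(AA_TO_CODONS[aa] for aa in peptide)):
        yield "".join(codons)


def translate(dna):
    """Translate an in-frame DNA string back to one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reachable_sites(peptide, sites=SITES):
    """Map each enzyme name to the encodings of `peptide` containing its site."""
    hits = {}
    for dna in encodings(peptide):
        for name, seq in sites.items():
            if seq in dna:
                hits.setdefault(name, []).append(dna)
    return hits


if __name__ == "__main__":
    # Hypothetical four-residue peptide Gly-Thr-Ser-Arg.
    for name, seqs in sorted(reachable_sites("GTSR").items()):
        print(name, len(seqs), "e.g.", seqs[0])
```

Exhaustive enumeration is only practical for short stretches such as the residues around a CDR boundary; a full-length V domain would need a position-by-position search instead.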

[0088] The DNA sequences for the synthetic 26-10 V H and V L are designed so that one or both of the restriction
sites flanking each of the three CDRs are unique. A six base site (such as that recognized by Bsm I or BspM I) is
preferred, but where six base sites are not possible, four or five base sites are used. These sites, if not already unique,
are rendered unique within the gene by eliminating other occurrences within the gene without altering necessary amino
acid sequences. Preferred cleavage sites are those that, once cleaved, yield fragments with sticky ends just outside
of the boundary of the CDR within the framework. However, such ideal sites are only occasionally possible because
the FR-CDR boundary is not an absolute one, and because the amino acid sequence of the FR may not permit a
restriction site. In these cases, flanking sites in the FR which are more distant from the predicted boundary are selected.
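Rendering a site unique "without altering necessary amino acid sequences" amounts to a synonymous codon change at each unwanted occurrence. The following is a minimal sketch of that idea under stated assumptions: the helper names (`find_sites`, `silence_site`) and the 15-base example gene are hypothetical, only the forward strand is searched (sufficient for palindromic sites such as KpnI; a non-palindromic site would also need a reverse-complement search), and only single-codon substitutions are tried.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string to one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def find_sites(dna, site):
    """Return every start position of `site` in `dna` (forward strand only)."""
    return [i for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]


def silence_site(dna, pos, site):
    """Remove the occurrence of `site` at `pos` by one synonymous codon swap.

    Returns the modified sequence, or None if no single swap reduces the
    number of occurrences of the site.
    """
    before = len(find_sites(dna, site))
    first, last = pos // 3, (pos + len(site) - 1) // 3
    for idx in range(first, last + 1):
        codon = dna[3 * idx:3 * idx + 3]
        for alt, aa in CODON_TO_AA.items():
            if alt == codon or aa != CODON_TO_AA[codon]:
                continue  # keep only synonymous alternatives
            candidate = dna[:3 * idx] + alt + dna[3 * idx + 3:]
            if len(find_sites(candidate, site)) < before:
                return candidate
    return None


if __name__ == "__main__":
    gene = "GGTACCGCTGGTACC"        # hypothetical gene with KpnI at 0 and 9
    fixed = silence_site(gene, 9, "GGTACC")
    print(find_sites(gene, "GGTACC"), "->", find_sites(fixed, "GGTACC"))
    print(translate(gene), translate(fixed))
```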
[0089] Figure 5 discloses the nucleotide and corresponding amino acid sequence (shown in standard single letter
code) of a synthetic DNA comprising a master framework gene having the generic structure:

R1-FR1-X1-FR2-X2-FR3-X3-FR4-R2

where R1 and R2 are restricted ends which are to be ligated into a vector, and X1, X2, and X3 are DNA sequences
whose function is to provide convenient restriction sites for CDR insertion. This particular DNA has murine FR
sequences and unique, 6-base restriction sites adjacent to the FR borders so that nucleotide sequences encoding CDRs
from a desired monoclonal can be inserted easily. Restriction endonuclease digestion sites are indicated with their
abbreviations; enzymes of choice for CDR replacement are underscored. Digestion of the gene with the following
restriction endonucleases results in 3' and 5' ends which can easily be matched up with and ligated to native or synthetic
CDRs of desired specificity: KpnI and BstXI are used for ligation of CDR1; XbaI and DraI for CDR2; and BssHII and
ClaI for CDR3.

OLIGONUCLEOTIDE SYNTHESIS 

[0090] The synthetic genes and DNA fragments designed as described above preferably are produced by assembly
of chemically synthesized oligonucleotides. 15-100mer oligonucleotides may be synthesized on a Biosearch DNA
Model 8600 Synthesizer, and purified by polyacrylamide gel electrophoresis (PAGE) in Tris-Borate-EDTA buffer (TBE).
The DNA is then electroeluted from the gel. Overlapping oligomers may be phosphorylated by T4 polynucleotide kinase
and ligated into larger blocks which may also be purified by PAGE.

CLONING OF SYNTHETIC OLIGONUCLEOTIDES 

[0091] The blocks or the pairs of longer oligonucleotides may be cloned into E. coli using a suitable, e.g., pUC,
cloning vector. Initially, this vector may be altered by single strand mutagenesis to eliminate residual six base altered
sites. For example, V H may be synthesized and cloned into pUC as five primary blocks spanning the following restriction
sites: 1. EcoRI to first NarI site; 2. first NarI to XbaI; 3. XbaI to SalI; 4. SalI to NcoI; 5. NcoI to BamHI. These cloned
fragments may then be isolated and assembled in several three-fragment ligations and cloning steps into the pUC8
plasmid. Desired ligations selected by PAGE are then transformed into, for example, E. coli strain JM83, and plated
onto LB Ampicillin + Xgal plates according to standard procedures. The gene sequence may be confirmed by supercoil
sequencing after cloning, or after subcloning into M13 via the dideoxy method of Sanger.

PRINCIPLE OF CDR EXCHANGE 

[0092] Three CDRs (or alternatively, four FRs) can be replaced per V H or V L . In simple cases, this can be
accomplished by cutting the shuttle pUC plasmid containing the respective genes at the two unique restriction sites
flanking each CDR or FR, removing the excised sequence, and ligating the vector with a native nucleic acid sequence or a
synthetic oligonucleotide encoding the desired CDR or FR. This three part procedure would have to be repeated three
times for total CDR replacement and four times for total FR replacement. Alternatively, a synthetic nucleotide encoding
two consecutive CDRs separated by the appropriate FR can be ligated to a pUC or other plasmid containing a gene
whose corresponding CDRs and FR have been cleaved out. This procedure reduces the number of steps required to
perform CDR and/or FR exchange.
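The cut, remove, and ligate cycle of paragraph [0092] can be modelled as a string operation. The sketch below is a deliberate simplification and not a description of the actual enzymology: it treats each flanking site as an intact boundary and ignores where within the site the enzyme cuts and what overhang it leaves. The function name `exchange_cassette` and all sequences are hypothetical.

```python
def exchange_cassette(gene, left_site, right_site, new_cassette):
    """Replace the segment between two unique flanking sites.

    `gene` is a DNA string; `left_site` and `right_site` are the recognition
    sequences flanking the CDR (or FR) to be exchanged. Both sites are kept
    intact in this simplified model. Raises ValueError if either site is
    absent or occurs more than once, mirroring the requirement that the
    flanking sites be unique.
    """
    for site in (left_site, right_site):
        if gene.count(site) != 1:
            raise ValueError(f"site {site} must occur exactly once")
    start = gene.index(left_site) + len(left_site)
    end = gene.index(right_site)
    if start > end:
        raise ValueError("left site must precede right site")
    return gene[:start] + new_cassette + gene[end:]


if __name__ == "__main__":
    kpni, xbai = "GGTACC", "TCTAGA"
    # Hypothetical gene: framework, site, old cassette, site, framework.
    gene = "AAA" + kpni + "CCCCCC" + xbai + "TTT"
    print(exchange_cassette(gene, kpni, xbai, "GATGAT"))
```

Replacing all three CDRs corresponds to three calls with three different site pairs; the alternative in the text, inserting two consecutive CDRs with their intervening FR at once, corresponds to a single call spanning the outer pair of sites.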

EXPRESSION OF PROTEINS

[0093] The engineered genes can be expressed in appropriate prokaryotic hosts such as various strains of E. coli,
and in eukaryotic hosts such as Chinese hamster ovary cells, murine myeloma, and human myeloma/transfectoma cells.
[0094] For example, if the gene is to be expressed in E. coli, it may first be cloned into an expression vector. This is
accomplished by positioning the engineered gene downstream from a promoter sequence such as trp or tac, and a
gene coding for a leader peptide. The resulting expressed fusion protein accumulates in refractile bodies in the
cytoplasm of the cells, and may be harvested after disruption of the cells by French press or sonication. The refractile
bodies are solubilized, and the expressed proteins refolded and cleaved by the methods already established for many
other recombinant proteins.

[0095] If the engineered gene is to be expressed in myeloma cells, the conventional expression system for
immunoglobulins, it is first inserted into an expression vector containing, for example, the Ig promoter, a secretion
signal, immunoglobulin enhancers, and various introns. This plasmid may also contain sequences encoding all or part of a
constant region, enabling all or part of a heavy or light chain to be expressed. The gene is transfected into myeloma
cells via established electroporation or protoplast fusion methods. Cells so transfected can express V L or V H fragments,
V L2 or V H2 homodimers, V L -V H heterodimers, V H -V L or V L -V H single chain polypeptides, complete heavy or light
immunoglobulin chains, or portions thereof, each of which may be attached in the various ways discussed above to a
protein region having another function (e.g., cytotoxicity).

[0096] Vectors containing a heavy chain V region (or V and C regions) can be cotransfected with analogous vectors
carrying a light chain V region (or V and C regions), allowing for the expression of noncovalently associated binding
sites (or complete antibody molecules).

[0097] In the examples which follow, a specific example of how to make a single chain binding site is disclosed, 
together with methods employed to assess its binding properties. Thereafter, a protein construct having two functional 
domains is disclosed. Lastly, there is disclosed a series of additional targeted proteins. 

I. EXAMPLE OF CDR EXCHANGE AND EXPRESSION

[0098] The synthetic genes coding for murine V H and V L 26-10 shown in Figures 4A and 4B were designed from the
known amino acid sequence of the protein with the aid of the Compugene software program. These genes, although
coding for the native amino acid sequences, also contain non-native and often unique restriction sites flanking nucleic
acid sequences encoding CDRs to facilitate CDR replacement as noted above.

[0099] Both the 3' and 5' ends of the large synthetic oligomers were designed to include 6-base restriction sites,
present in the genes and the pUC vector. Furthermore, those restriction sites in the synthetic genes which were only
suited for assembly but not for cloning into pUC were extended by "helper" cloning sites with matching sites in pUC.
[0100] Cloning of the synthetic DNA and later assembly of the gene is facilitated by the spacing of unique restriction
sites along the gene. This allows corrections and modifications by cassette mutagenesis at any location. Among them
are alterations near the 5' or 3' ends of the gene as needed for the adaptation to different expression vectors. For
example, a PstI site is positioned near the 5' end of the V H gene. Synthetic linkers can be attached easily between this
site and a restriction site in the expression plasmid. These genes were synthesized by assembling oligonucleotides
as described above using a Biosearch Model 8600 DNA Synthesizer. They were ligated to vector pUC8 for
transformation of E. coli.

[0101] Specific CDRs may be cleaved from the synthetic V H gene by digestion with the following pairs of restriction
endonucleases: HphI and BstXI for CDR1; XbaI and DraI for CDR2; and BanII and BanI for CDR3. After removal of
one CDR, another CDR of desired specificity may be ligated directly into the restricted gene in its place, if the 3' and
5' ends of the restricted gene and the new CDR contain complementary single stranded DNA sequences.
[0102] In the present example, the three CDRs of each of murine V H 26-10 and V L 26-10 were replaced with the
corresponding CDRs of glp-4. The nucleic acid sequences and corresponding amino acid sequences of the chimeric
V H and V L genes encoding the FRs of 26-10 and CDRs of glp-4 are shown in Figures 4C and 4D. The positions of the
restriction endonuclease cleavage sites are noted with their standard abbreviations. CDR sequences are underlined
as are the restriction endonucleases of choice useful for further CDR replacement.

[0103] These genes were cloned into pUC8, a shuttle plasmid. To retain unique restriction sites after cloning, the
V H -like gene was spliced into the EcoRI and HindIII or BamHI sites of the plasmid.

[0104] Direct expression of the genes may be achieved in E. coli. Alternatively, the gene may be preceded by a
leader sequence and expressed in E. coli as a fusion product by splicing the fusion gene into the host gene whose
expression is regulated by interaction of a repressor with the respective operator. The protein can be induced by
starvation in minimal medium and by chemical inducers. The V H -V L biosynthetic 26-10 gene has been expressed as such
a fusion protein behind the trp and tac promoters. The gene translation product of interest may then be cleaved from
the leader in the fusion protein by, e.g., cyanogen bromide degradation, tryptic digestion, mild acid cleavage, and/or
digestion with factor Xa protease. Therefore, a shuttle plasmid containing a synthetic gene encoding a leader peptide
having a site for mild acid cleavage, and into which has been spliced the synthetic BABS gene, was used for this
purpose. In addition, synthetic DNA sequences encoding a signal peptide for secretion of the processed target protein
into the periplasm of the host cell can also be incorporated into the plasmid.

[0105] After harvesting the gene product and optionally releasing it from a fusion peptide, its activity as an antibody
binding site and its specificity for the glp-4 (lysozyme) epitope are assayed by established immunological techniques,
e.g., affinity chromatography and radioimmunoassay. Correct folding of the protein to yield the proper three-dimensional
conformation of the antibody binding site is prerequisite for its activity. This occurs spontaneously in a host such as a
myeloma cell which naturally expresses immunoglobulin proteins. Alternatively, for bacterial expression, the protein
forms inclusion bodies which, after harvesting, must be subjected to a specific sequence of solvent conditions (e.g.,
diluted 20 X from 8 M urea, 0.1 M Tris-HCl, pH 9, into 0.15 M NaCl, 0.01 M sodium phosphate, pH 7.4 (Hochman et al.
(1976) Biochem. 15:2706-2710)) to assume its correct conformation and hence its active form.
[0106] Figures 4E and 4F show the DNA and amino acid sequence of chimeric V H and V L comprising human FRs
from NEWM and murine CDRs from glp-4. The CDRs are underlined, as are restriction sites of choice for further CDR
replacement or empirically determined refinement.

[0107] These constructs also constitute master framework genes, this time constructed of human framework
sequences. They may be used to construct BABS of any desired specificity by appropriate CDR replacement.
[0108] Binding sites with other specificities have also been designed using the methodologies disclosed herein.
Examples include those having FRs from the human NEWM antibody and CDRs from murine 26-10 (Figure 9A), murine
26-10 FRs and G-loop CDRs (Figure 9B), FRs and CDRs from murine MOPC-315 (Figure 9C), FRs and CDRs from
an anti-human carcinoembryonic antigen monoclonal antibody (Figure 9D), and FRs and CDRs 1, 2, and 3 from V L
and FRs and CDRs 1 and 3 from the V H of the anti-CEA antibody, with CDR2 from a consensus immunoglobulin gene
(Figure 9E).

II. Model Binding Site: 

[0109] The digoxin binding site of the IgG2a,κ monoclonal antibody 26-10 has been analyzed by Mudgett-Hunter and
colleagues (unpublished). The 26-10 V region sequences were determined from both amino acid sequencing and DNA
sequencing of 26-10 H and L chain mRNA transcripts (D. Panka, J.N. & M.N.M., unpublished data). The 26-10 antibody
exhibits a high digoxin binding affinity [K0 = 5.4 X 10^9 M^-1] and has a well-defined specificity profile, providing a
baseline for comparison with the biosynthetic binding sites mimicking its structure.

Protein Design:

[0110] Crystallographically determined atomic coordinates for Fab fragments of 26-10 were obtained from the
Brookhaven Data Bank. Inspection of the available three-dimensional structures of Fv regions within their parent Fab
fragments indicated that the Euclidean distance between the C-terminus of the V H domain and the N-terminus of the
V L domain is about 3.5 nm (35 Å). Considering that the peptide unit length is approximately 3.8 Å, a 15 residue linker
was selected to bridge this gap. The linker was designed so as to exhibit little propensity for secondary structure and
not to interfere with domain folding. Thus, the 15 residue sequence (Gly-Gly-Gly-Gly-Ser)3 was selected to connect
the V H carboxyl- and V L amino-termini.
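The text does not spell out the arithmetic behind the choice of 15 residues. One simple way to check it, offered here only as an illustration, is to compare the 35 Å gap with the span of a fully extended chain at 3.8 Å per residue. The function names are hypothetical, and treating the linker as a fully extended chain is an assumption that gives an upper bound on its reach.

```python
import math

GAP_ANGSTROM = 35.0            # V H C-terminus to V L N-terminus (from the text)
PEPTIDE_UNIT_ANGSTROM = 3.8    # approximate length per residue (from the text)


def min_residues(gap, unit=PEPTIDE_UNIT_ANGSTROM):
    """Fewest residues whose fully extended length reaches `gap`."""
    return math.ceil(gap / unit)


def max_span(n_residues, unit=PEPTIDE_UNIT_ANGSTROM):
    """Upper bound on the distance spanned by `n_residues`, fully extended."""
    return n_residues * unit


if __name__ == "__main__":
    print("minimum residues to bridge the gap:", min_residues(GAP_ANGSTROM))
    for repeats in (1, 3, 5):  # (Gly4-Ser)n linkers of 5, 15, and 25 residues
        n = 5 * repeats
        print(f"(Gly4-Ser){repeats}: {n} residues, at most {max_span(n):.0f} Å")
```

Under this assumption a 5 residue linker (at most 19 Å) cannot bridge the gap at all, while 15 residues (at most 57 Å) bridge it with slack, so the linker need not be fully extended.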

[0111] Binding studies with single chain binding sites having less than or greater than 15 residues demonstrate the
importance of the prerequisite distance which must separate V H from V L ; for example, a (Gly4-Ser)1 linker does not
demonstrate binding activity, and those with (Gly4-Ser)5 linkers exhibit very low activity compared to those with
(Gly4-Ser)3 linkers.

Gene Synthesis:

[0112] Design of the 744 base sequence for the synthetic binding site gene was derived from the Fv protein sequence
of 26-10 by choosing codons frequently used in E. coli. The model of this representative synthetic gene is shown in
Figure 8, discussed previously. Synthetic genes coding for the trp promoter-operator, the modified trp LE leader peptide
(MLE), the sequence of which is shown in Figure 10A, and V H were prepared largely as described previously. The
gene coding for V H was assembled from 46 chemically synthesized oligonucleotides, all 15 bases long, except for
terminal fragments (13 to 19 bases) that included cohesive cloning ends. Between 8 and 15 overlapping
oligonucleotides were enzymatically ligated into double stranded DNA, cut at restriction sites suitable for cloning (NarI,
XbaI, SalI, SacII, SacI), purified by PAGE on 8% gels, and cloned in pUC which was modified to contain additional cloning
sites in the polylinker. The cloned segments were assembled stepwise into the complete gene mimicking V H by ligations
in the pUC cloning vector.

[0113] The gene mimicking 26-10 V L was assembled from 12 long synthetic polynucleotides ranging in size from 33
to 88 base pairs, prepared in automated DNA synthesizers (Model 6500, Biosearch, San Rafael, CA; Model 380A,
Applied Biosystems, Foster City, CA). Five individual double stranded segments were made out of pairs of long synthetic
oligonucleotides spanning six-base restriction sites in the gene (AatII, BstEII, KpnI, HindIII, BglII, and PstI). In one
case, four long overlapping strands were combined and cloned. Gene fragments bounded by restriction sites for
assembly that were absent from the pUC polylinker, such as AatII and BstEII, were flanked by EcoRI and BamHI ends
to facilitate cloning.

[0114] The linker between V H and V L , encoding (Gly-Gly-Gly-Gly-Ser)3, was cloned from two long synthetic
oligonucleotides, 54 and 62 bases long, spanning SacI and AatII sites, the latter followed by an EcoRI cloning end. The
complete single chain binding site gene was assembled from the V H , V L , and linker genes to produce a construct,
corresponding to aspartyl-prolyl-V H -(linker)-V L , flanked by EcoRI and PstI restriction sites.

[0115] The trp promoter-operator, starting from its SspI site, was assembled from 12 overlapping 15 base oligomers,
and the MLE leader gene was assembled from 24 overlapping 15 base oligomers. These were cloned and assembled
in pUC using the strategy of assembly sites flanked by cloning sites. The final expression plasmid was constructed in
the pBR322 vector by a 3-part ligation using the sites SspI, EcoRI, and PstI (see Figure 10B). Intermediate DNA
fragments and assembled genes were sequenced by the dideoxy method.

Fusion Protein Expression:

[0116] Single-chain protein was expressed as a fusion protein. The MLE leader gene (Fig. 10A) was derived from
the E. coli trp LE sequence and expressed under the control of a synthetic trp promoter and operator. E. coli strain JM83
was transformed with the expression plasmid and protein expression was induced in M9 minimal medium by addition
of indoleacrylic acid (10 µg/ml) at a cell density with A600 = 1. The high expression levels of the fusion protein resulted
in its accumulation as insoluble protein granules, which were harvested from cell paste (Figure 11, Lane 1).

Fusion Protein Cleavage:

[0117] The MLE leader was removed from the binding site protein by acid cleavage of the Asp-Pro peptide bond
engineered at the junction of the MLE and binding site sequences. The washed protein granules containing the fusion
protein were cleaved in 6 M guanidine-HCl + 10% acetic acid, pH 2.5, incubated at 37°C for 96 hrs. The reaction was
stopped through precipitation by addition of a 10-fold excess of ethanol with overnight incubation at -20°C, followed
by centrifugation and storage at -20°C until further purification (Figure 11, Lane 2).

Protein Purification:

[0118] The acid cleaved binding site was separated from remaining intact fused protein species by chromatography
on DEAE cellulose. The precipitate obtained from the cleavage mixture was redissolved in 6 M guanidine-HCl + 0.2
M Tris-HCl, pH 8.2, + 0.1 M 2-mercaptoethanol and dialyzed exhaustively against 6 M urea + 2.5 mM Tris-HCl, pH 7.5,
+ 1 mM EDTA. 2-Mercaptoethanol was added to a final concentration of 0.1 M, and the solution was incubated for 2 hrs at
room temperature and loaded onto a 2.5 X 45 cm column of DEAE cellulose (Whatman DE 52), equilibrated with 6 M
urea + 2.5 mM Tris-HCl + 1 mM EDTA, pH 7.5. The intact fusion protein bound weakly to the DE 52 column such that
its elution was retarded relative to that of the binding protein. The first protein fractions which eluted from the column
after loading and washing with urea buffer contained BABS protein devoid of intact fusion protein. Later fractions
contaminated with some fused protein were pooled and rechromatographed on DE 52, and the recovered single chain
binding protein was combined with other purified protein into a single pool (Figure 11, Lane 3).

Refolding:

[0119] The 26-10 binding site mimic was refolded as follows: the DE 52 pool, disposed in 6 M urea + 2.5 mM Tris-HCl
+ 1 mM EDTA, was adjusted to pH 8 and reduced with 0.1 M 2-mercaptoethanol at 37°C for 90 min. This was diluted
at least 100-fold with 0.01 M sodium acetate, pH 5.5, to a concentration below 10 µg/ml and dialyzed at 4°C for 2 days
against acetate buffer.

Affinity Chromatography:

[0120] Active binding protein was purified by affinity chromatography at 4°C on an ouabain-amine-Sepharose column.
The dilute solution of refolded protein was loaded directly onto a pair of tandem columns, each
containing 3 ml of resin equilibrated with the 0.01 M acetate buffer, pH 5.5. The columns were washed individually with an
excess of the acetate buffer, and then by sequential additions of 5 ml each of 1 M NaCl, 20 mM ouabain, and 3 M
potassium thiocyanate dissolved in the acetate buffer, interspersed with acetate buffer washes. Since digoxin binding
activity was still present in the eluate, the eluate was pooled and concentrated 20-fold by ultrafiltration (PM 10
membrane, 200 ml concentrator; Amicon), reapplied to the affinity columns, and eluted as described. Fractions with
significant absorbance at 280 nm were pooled and dialyzed against PBSA or the above acetate buffer. The amounts of
protein in the DE 52 and ouabain-Sepharose pools were quantitated by amino acid analysis following dialysis against
0.01 M acetate buffer. The results are shown below in Table 1.

TABLE 1

Estimated Yields of BABS Protein During Purification

Step                        Wet wt.    mg protein      Cleavage yield (%)   Yield relative to
                            per L                      prior step           fusion protein
Cell paste                  12.0 g     1440.0 mg a
Fusion protein granules     2.3 g      480.0 mg a,b    100.0%               100.0%
Acid cleavage/DE 52 pool               144.0 mg        38.0 e               38.0 e
Ouabain-Sepharose pool                 18.1 mg         12.6 d               4.7 e

a Determined by Lowry protein analysis
b Determined by absorbance measurements
c Determined by amino acid analysis
d Calculated from the amount of BABS protein specifically eluted from ouabain-Sepharose relative to that applied to
  the resin; values were determined by amino acid analysis
e Percentage yield calculated on a molar basis
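The yields in Table 1 mix mass figures with percentages stated on a molar basis. The sketch below shows how the two relate, using only the mg values and percentages from the table. The function name is hypothetical, and the molecular weight ratio it derives is an inference from the table, not a figure stated in the text.

```python
def step_yield_percent(recovered_mg, applied_mg, mw_recovered=1.0, mw_applied=1.0):
    """Percent yield on a molar basis.

    With the default molecular weights of 1.0 the result is simply the
    mass yield, which equals the molar yield when the protein species is
    the same before and after the step.
    """
    return 100 * (recovered_mg / mw_recovered) / (applied_mg / mw_applied)


# Values from Table 1 (mg protein).
FUSION_MG, DE52_MG, OUABAIN_MG = 480.0, 144.0, 18.1

# Cleavage removes the leader, so mass yield understates the molar yield.
mass_yield = step_yield_percent(DE52_MG, FUSION_MG)          # mass basis
implied_mw_ratio = mass_yield / 38.0   # BABS/fusion MW ratio implied by 38.0%
molar_yield = step_yield_percent(DE52_MG, FUSION_MG, mw_recovered=implied_mw_ratio)

# The affinity step does not change the species, so mass and molar yields agree.
affinity_yield = step_yield_percent(OUABAIN_MG, DE52_MG)
overall = molar_yield * affinity_yield / 100

if __name__ == "__main__":
    print(f"cleavage, mass basis:  {mass_yield:.1f}%")
    print(f"implied MW ratio:      {implied_mw_ratio:.2f}")
    print(f"affinity step:         {affinity_yield:.1f}%")
    print(f"overall, molar basis:  {overall:.1f}%")
```

The affinity step reproduces the 12.6% in the table, and the product of the two step yields comes to about 4.8%, close to the 4.7% reported; the small difference is consistent with rounding in the published values.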






Sequence Analysis of Gene and Protein:

[0121] The complete gene was sequenced in both directions using the dideoxy method of Sanger which confirmed
the gene was correctly assembled. The protein sequence was also verified by protein sequencing. Automated Edman
degradation was conducted on intact protein (residues 1-40), as well as on two major CNBr fragments (residues
108-129 and 140-159) with a Model 470A gas phase sequencer equipped with a Model 120A on-line
phenylthiohydantoin-amino acid analyzer (Applied Biosystems, Foster City, CA). Homogeneous binding protein fractionated by
SDS-PAGE and eluted from gel strips with water was treated with a 20,000-fold excess of CNBr, in 1% trifluoroacetic
acid-acetonitrile (1:1), for 12 hrs at 25°C (in the dark). The resulting fragments were separated by SDS-PAGE and
transferred electrophoretically onto an Immobilon membrane (Millipore, Bedford, MA), from which stained bands were
cut out and sequenced.

Specificity Determination:

[0122] Specificities of anti-digoxin 26-10 Fab and the BABS were assessed by radioimmunoassay. Wells of microtiter
plates were coated with affinity-purified goat anti-murine Fab fragment (ICN ImmunoBiologicals, Lisle, IL) at 10 µg/ml
in PBSA overnight at 4°C. After the plates were washed and blocked with 1% horse serum in PBSA, solutions (50 µl)
containing 26-10 Fab or the BABS in either PBSA or 0.01 M sodium acetate at pH 5.5 were added to the wells and
incubated 2-3 hrs at room temperature. After unbound antibody fragment was washed from the wells, 25 µl of a series
of concentrations of cardiac glycosides (10^-4 to 10^-11 M in PBSA) were added. The cardiac glycosides tested included
digoxin, digitoxin, digoxigenin, digitoxigenin, gitoxin, ouabain, and acetyl strophanthidin. After the addition of
125I-digoxin (25 µl, 50,000 cpm; Cambridge Diagnostics, Billerica, MA) to each well, the plates were incubated overnight at
4°C, washed and counted. The inhibition curves are plotted in Figure 12. The relative affinities for each digoxin analogue
were calculated by dividing the concentration of each analogue at 50% inhibition by the concentration of digoxin (or
digoxigenin) that gave 50% inhibition. There is a displacement of inhibition curves for the BABS to lower glycoside
concentrations than observed for 26-10 Fab, because less active BABS than 26-10 Fab was bound to the plate. When
0.25 M urea was added to the BABS in 0.01 M sodium acetate, pH 5.5, more active sFv was bound to the goat
anti-murine Fab coating on the plate. This caused the BABS inhibition curves to shift toward higher glycoside concentrations,
closer to the position of those for 26-10 Fab, although maintaining the relative positions of curves for sFv obtained in
acetate buffer alone. The results, expressed as normalized concentration of inhibitor giving 50% inhibition of
125I-digoxin binding, are shown in Table 2.

TABLE 2

26-10 Antibody   Normalizing
Species          Glycoside      D      DG     DO     DOG    A-S    G      O
Fab              Digoxin        1.0    1.2    0.9    1.0    1.3    9.6    15
                 Digoxigenin    0.9    1.0    0.8    0.9    1.1    8.1    13
BABS             Digoxin        1.0    7.3    2.0    2.6    5.9    62     150
                 Digoxigenin    0.1    1.0    0.3    0.4    0.8    8.5    21

D = Digoxin
DG = Digoxigenin
DO = Digitoxin
DOG = Digitoxigenin
A-S = Acetyl Strophanthidin
G = Gitoxin
O = Ouabain
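The normalization behind Table 2 divides each analogue's 50% inhibition concentration by that of the reference glycoside. Because the tabulated values are already ratios, the same division converts a digoxin-normalized row into a digoxigenin-normalized one. The sketch below applies this to the BABS digoxin row; the function name `relative_affinities` is hypothetical.

```python
def relative_affinities(ic50, reference):
    """Divide every 50% inhibition value by that of the reference glycoside."""
    ref = ic50[reference]
    return {name: value / ref for name, value in ic50.items()}


# BABS row of Table 2, normalized to digoxin (D).
BABS_VS_DIGOXIN = {
    "D": 1.0, "DG": 7.3, "DO": 2.0, "DOG": 2.6, "A-S": 5.9, "G": 62.0, "O": 150.0,
}

if __name__ == "__main__":
    # Re-express the same row relative to digoxigenin (DG).
    for name, value in relative_affinities(BABS_VS_DIGOXIN, "DG").items():
        print(f"{name:4s} {value:6.2f}")
```

The recomputed values agree with the BABS digoxigenin row of the table to within rounding of the published figures.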






Affinity Determination:

[0123] Association constants were measured by equilibrium binding studies. In immunoprecipitation experiments,
100 µl of 3H-digoxin (New England Nuclear, Billerica, MA) at a series of concentrations (10^-7 M to 10^-11 M) were added
to 100 µl of 26-10 Fab or the BABS at a fixed concentration. After 2-3 hrs of incubation at room temperature, the protein
was precipitated by the addition of 100 µl goat antiserum to murine Fab fragment (ICN ImmunoBiologicals), 50 µl of
the IgG fraction of rabbit anti-goat IgG (ICN ImmunoBiologicals), and 50 µl of a 10% suspension of protein A-Sepharose
(Sigma). Following 2 hrs at 4°C, bound and free antigen were separated by vacuum filtration on glass fiber filters
(Vacuum Filtration Manifold, Millipore, Bedford, MA). Filter disks were then counted in 5 ml of scintillation fluid with a
Model 1500 Tri-Carb Liquid Scintillation Analyzer (Packard, Sterling, VA). The association constants, K0, were
calculated from Scatchard analyses of the untransformed radioligand binding data using LIGAND, a non-linear curve fitting
program based on mass action. K0 values were also calculated by Sips plots and binding isotherms shown in Figure 13A for
the BABS and 13B for the Fab. For binding isotherms, data are plotted as the concentration of digoxin bound versus
the log of the unbound digoxin concentration, and the dissociation constant is estimated from the ligand concentration
at 50% saturation. These binding data are also plotted in linear form as Sips plots (inset), having the same abscissa
as the binding isotherm but with the ordinate representing log [r/(n-r)], defined below. The average intrinsic association
constant (K0) was calculated from the modified Sips equation (39), log [r/(n-r)] = a log C - a log K0, where r equals moles
of digoxin bound per mole of antibody at an unbound digoxin concentration equal to C; n is the number of moles of
digoxin bound at saturation of the antibody binding site, and a is an index of heterogeneity which describes the
distribution of association constants about the average intrinsic association constant K0. Least squares linear regression
analysis of the data indicated correlation coefficients for the lines obtained were 0.96 for the BABS and 0.99 for 26-10
Fab. A summary of the calculated association constants is shown below in Table 3.



TABLE 3

                           Association Constant, K0
Method of Data Analysis    K0 (BABS), M^-1       K0 (Fab), M^-1
Scatchard plot             (3.2±0.9) X 10^7      (1.9±0.2) X 10^8
Sips plot                  2.6 X 10^7            1.8 X 10^8
Binding isotherm           5.2 X 10^7            3.3 X 10^8
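For readers unfamiliar with Scatchard analysis, the sketch below shows the classical linear form, r/C = nK - Kr, fitted by ordinary least squares. This is not the LIGAND program named in the text, which fits the untransformed data by non-linear regression; it is the simpler textbook linearization, shown on synthetic, noise-free data. The function name and the concentration series are hypothetical, and the value 1.9 X 10^8 M^-1 is borrowed from Table 3 purely as an illustrative input.

```python
def scatchard_fit(bound, free):
    """Estimate (K, n) from the linear Scatchard form r/C = n*K - K*r.

    `bound` holds r (moles of ligand bound per mole of protein) and `free`
    holds C (molar concentration of unbound ligand). An ordinary least
    squares line of r/C against r has slope -K and intercept n*K.
    """
    xs = list(bound)
    ys = [r / c for r, c in zip(bound, free)]
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k = -slope
    return k, intercept / k


def simulate_binding(k, n, free):
    """Noise-free single-site binding: r = n*K*C / (1 + K*C)."""
    return [n * k * c / (1 + k * c) for c in free]


if __name__ == "__main__":
    free = [1e-10, 3e-10, 1e-9, 3e-9, 1e-8, 3e-8]   # hypothetical series, M
    bound = simulate_binding(1.9e8, 1.0, free)       # illustrative K and n
    k, n = scatchard_fit(bound, free)
    print(f"K = {k:.3g} M^-1, n = {n:.3f}")
```

With real data the linear transform distorts the error structure, which is one reason a non-linear fit of the untransformed data is often preferred.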



III. Synthesis of a Multifunctional Protein 

[0124] A nucleic acid sequence encoding the single chain binding site described above was fused with a sequence
encoding the FB fragment of protein A as a leader to function as a second active region. As a spacer, the native amino
acids comprising the last 11 amino acids of the FB fragment, bonded to an Asp-Pro dilute acid cleavage site, were
employed. The binding domain of the FB consists of the immediately preceding 43 amino acids which assume a
helical configuration (see Fig. 2B).

[0125] The gene fragments are synthesized using a Biosearch DNA Model 8600 Synthesizer as described above.
Synthetic oligonucleotides are cloned according to established protocol described above using the pUC8 vector
transfected into E. coli. The completed fused gene set forth in Figure 6A is then expressed in E. coli.
[0126] After sonication, inclusion bodies were collected by centrifugation, and dissolved in 6 M guanidine
hydrochloride (GuHCl), 0.2 M Tris, and 0.1 M 2-mercaptoethanol (BME), pH 8.2. The protein was denatured and reduced in the
solvent overnight at room temperature. Size exclusion chromatography was used to purify fusion protein from the
inclusion bodies. A Sepharose 4B column (1.5 X 80 cm) was run in a solvent of 6 M GuHCl and 0.01 M NaOAc, pH
4.75. The protein solution was applied to the column at room temperature in 0.5-1.0 ml amounts. Fractions were
collected and precipitated with cold ethanol. These were run on SDS gels, and fractions rich in the recombinant protein
(approximately 34,000 D) were pooled. This offers a simple first step for cleaning up inclusion body preparations without
suffering significant proteolytic degradation.

[0127] For refolding, the protein was dialyzed against 100 ml of the same GuHCl-Tris-BME solution, and dialysate
was diluted 11-fold over two days to 0.55 M GuHCl, 0.01 M Tris, and 0.01 M BME. The dialysis sacks were then
transferred to 0.01 M NaCl, and the protein was dialyzed exhaustively before being assayed by RIAs for binding of
125I-labelled digoxin. The refolding procedure can be simplified by making a rapid dilution with water to reduce the GuHCl
concentration to 1.1 M, and then dialyzing against phosphate buffered saline (0.15 M NaCl, 0.05 M potassium
phosphate, pH 7, containing 0.03% NaN3), so that it is free of any GuHCl within 12 hours. Product of both types of
preparation showed binding activity, as indicated in Figure 7A.

Demonstration of Bifunctionality:

[0128] This protein with an FB leader and a fused BABS is bifunctional; the BABS can bind the antigen and the FB
can bind the Fc regions of immunoglobulins. To demonstrate this dual and simultaneous activity, several
radioimmunoassays were performed.

[0129] Properties of the binding site were probed by a modification of an assay developed by Mudgett-Hunter et al. 
(J. Immunol. (1982) 129:1165-1172; Molec. Immunol. (1985) 22:477-488), so that it could be run on microtiter plates 
as a solid phase sandwich assay. Binding data were collected using goat anti-murine Fab antisera (gAmFab) as the 
primary antibody that initially coats the wells of the plate. These are polyclonal antisera which recognize epitopes that 
appear to reside mostly on framework regions. The samples of interest are next added to the coated wells and incubated 
with the gAmFab, which binds species that exhibit appropriate antigenic sites. After washing away unbound protein, 
the wells are exposed to 125I-labelled (radioiodinated) digoxin conjugates, either as 125I-dig-BSA or 125I-dig-lysine.

[0130] The data are plotted in Figure 7A, which shows the results of a dilution curve experiment in which the parent
26-10 antibody was included as a control. The sites were probed with 125I-dig-BSA as described above, with a series
of dilutions prepared from initial stock solutions, including both the slowly refolded (1) and fast diluted/quickly refolded
(2) single chain proteins. The parallelism between all three dilution curves indicates that gAmFab binding regions on
the BABS molecule are essentially the same as on the Fv of authentic 26-10 antibody, i.e., the surface epitopes appear
to be the same for both proteins.

[0131] The sensitivity of these assays is such that binding affinity of the Fv for digoxin must be at least 10^6.
Experimental data on digoxin binding yielded binding constants in the range of 10^8 to 10^9 M^-1. The parent 26-10
antibody has an affinity of 5.4 x 10^9 M^-1. Inhibition assays also indicate that the binding of 125I-dig-lysine can be
inhibited by unlabelled digoxin, digoxigenin, digitoxin, digitoxigenin, gitoxin, acetyl strophanthidin, and ouabain in a
way largely parallel to the parent 26-10 Fab. This indicates that the specificity of the biosynthetic protein is
substantially identical to that of the original monoclonal.
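
To put the association constants quoted above on a more familiar scale, the following sketch converts them to dissociation constants using the standard relation Kd = 1/Ka. The function names are illustrative; only the numerical Ka values come from the text.

```python
def kd_from_ka(ka_per_molar):
    """Dissociation constant (M) from an association constant (M^-1): Kd = 1 / Ka."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def to_nanomolar(molar):
    """Convert a concentration in mol/L to nmol/L."""
    return molar * 1e9


# Association constants quoted in the text (M^-1).
KA_SINGLE_CHAIN_RANGE = (1e8, 1e9)   # biosynthetic single chain protein
KA_PARENT_26_10 = 5.4e9              # parent 26-10 antibody

kd_range_nm = tuple(to_nanomolar(kd_from_ka(ka)) for ka in KA_SINGLE_CHAIN_RANGE)
kd_parent_nm = to_nanomolar(kd_from_ka(KA_PARENT_26_10))

if __name__ == "__main__":
    print("single chain Kd range (nM):", kd_range_nm)
    print("parent 26-10 Kd (nM):", kd_parent_nm)
```

On this scale the single chain protein falls in the low nanomolar range and the parent antibody in the sub-nanomolar range.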

[0132] In a second type of assay, digoxin-BSA is used to coat microtiter plates. Renatured BABS (FB-BABS) is
added to the coated plates so that only molecules that have a competent binding site can stick to the plate. 125I-labelled
rabbit IgG (radioligand) is mixed with bound FB-BABS on the plates. Bound radioactivity reflects the interaction of IgG
with the FB domain of the BABS, and the specificity of this binding is demonstrated by its inhibition with increasing
amounts of FB, Protein A, rabbit IgG, IgG2a, and IgG1, as shown in Figure 7B.

[0133] The following species were tested in order to demonstrate authentic binding: unlabelled rabbit IgG and IgG2a
monoclonal antibody (which bind competitively to the FB domain of the BABS); and protein A and FB (which bind
competitively to the radioligand). As shown in Figure 7B, these species are found to completely inhibit radioligand
binding, as expected. A monoclonal antibody of the IgG1 subclass binds poorly to the FB, as expected, inhibiting only
about 34% of the radioligand from binding. These data indicate that the BABS domain and the FB domain have
independent activity.
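
The inhibition figures quoted above are of the kind obtained by comparing bound radioactivity with and without a competitor. A minimal sketch of that calculation follows; the counts are hypothetical placeholders chosen for illustration, not data from this document, and the function name is an assumption.

```python
def percent_inhibition(cpm_with_inhibitor, cpm_no_inhibitor, cpm_background=0.0):
    """Percent of specific radioligand binding blocked by a competitor.

    All arguments are bound counts per minute; background is subtracted
    from both measurements before the ratio is taken.
    """
    specific_max = cpm_no_inhibitor - cpm_background
    if specific_max <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific = cpm_with_inhibitor - cpm_background
    return 100.0 * (1.0 - specific / specific_max)


# Hypothetical counts, for illustration only.
partial = percent_inhibition(cpm_with_inhibitor=6600, cpm_no_inhibitor=10000)
complete = percent_inhibition(cpm_with_inhibitor=500, cpm_no_inhibitor=10000,
                              cpm_background=500)

if __name__ == "__main__":
    print(partial)    # a weak competitor leaves most of the signal
    print(complete)   # a strong competitor reduces the signal to background
```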


IV. OTHER CONSTRUCTS 

[0134] Other BABS-containing proteins expressible in E. coli and other host cells as described above are set forth in
the drawing. These proteins may be bifunctional or multifunctional. Each construct includes a single chain BABS linked
via a spacer sequence to an effector molecule comprising amino acids defining a biologically active effector protein
such as an enzyme, receptor, toxin, or growth factor. Some examples of such constructs shown in the drawing include
proteins comprising epidermal growth factor (EGF) (Figure 15A), streptavidin (Figure 15B), tumor necrosis factor (TNF)
(Figure 15C), calmodulin (Figure 15D), the beta chain of platelet derived growth factor (B-PDGF) (Figure 15E), ricin A
(Figure 15F), interleukin 2 (Figure 15G), and FB dimer (Figure 15H). Each is used as a trailer and is connected to a
preselected BABS via a spacer (Gly-Ser-Gly) encoded by DNA defining a BamHI restriction site. Additional amino acids
may be added to the spacer for empirical refinement of the construct if necessary by opening up the BamHI site and
inserting an oligonucleotide of a desired length having BamHI sticky ends. Each gene also terminates with a PstI site
to facilitate insertion into a suitable expression vector.
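
The relationship between the Gly-Ser-Gly spacer and the BamHI site can be illustrated with a small sketch. The BamHI recognition sequence GGATCC reads as the codons GGA (Gly) and TCC (Ser); the third codon (GGT) and the 5-bp oligonucleotide core used below are hypothetical choices made for illustration and are not taken from this document. The insertion helper models only the top strand after ligating an oligonucleotide with GATC overhangs into the opened site.

```python
# Minimal codon table: only the codons this sketch uses.
CODON_TO_AA = {"GGA": "Gly", "GGT": "Gly", "GGC": "Gly", "GGG": "Gly",
               "TCC": "Ser", "TCT": "Ser"}

BAMHI_SITE = "GGATCC"  # BamHI cuts G^GATCC, leaving a 5'-GATC overhang


def translate(dna):
    """Translate an in-frame DNA string into three-letter residue names."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def insert_at_bamhi(top_strand, core):
    """Top strand after ligating an oligo with GATC overhangs into the first BamHI site.

    `core` is the double-stranded portion of the oligo (top strand, 5' to 3').
    The insertion adds len(core) + 4 nucleotides, so the reading frame is kept
    only when that total is a multiple of three.
    """
    cut = top_strand.index(BAMHI_SITE) + 1  # position just after the first G
    return top_strand[:cut] + "GATC" + core + top_strand[cut:]


spacer = "GGATCCGGT"                          # hypothetical coding for Gly-Ser-Gly
extended = insert_at_bamhi(spacer, "CGGTG")   # hypothetical core; adds 9 nucleotides

if __name__ == "__main__":
    print(translate(spacer))
    print(extended, translate(extended))
```

With this particular core the BamHI site is regenerated on both sides of the insert and the spacer is lengthened by three residues; other cores would give other additions, provided the frame condition noted in the docstring holds.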

[0135] The BABS of the EGF and PDGF constructs may be, for example, specific for fibrin so that the EGF or PDGF
is delivered to the site of a wound. The BABS for TNF and ricin A may be specific to a tumor antigen, e.g., CEA, to
produce a construct useful in cancer therapy. The calmodulin construct binds radioactive ions and other metal ions.
Its BABS may be specific, for example, to fibrin or a tumor antigen, so that it can be used as an imaging agent to locate
a thrombus or tumor. The streptavidin construct binds biotin with very high affinity. The biotin may be labeled with
a remotely detectable ion for imaging purposes. Alternatively, the biotin may be immobilized on an affinity matrix or
solid support. The BABS-streptavidin protein could then be bound to the matrix or support for affinity chromatography
or solid phase immunoassay. The interleukin-2 construct could be linked, for example, to a BABS specific for a T-cell
surface antigen. The FB-FB dimer binds to Fc, and could be used with a BABS in an immunoassay or affinity purification
procedure linked to a solid phase through immobilized immunoglobulin.

[0136] Figure 14 exemplifies a multifunctional protein having an effector segment as a leader. It comprises an FB-FB
dimer linked through its C-terminus via an Asp-Pro dipeptide to a BABS of choice. It functions in a way very similar to
the construct of Fig. 15H. The dimer binds avidly to the Fc portion of immunoglobulin. This type of construct can
accordingly also be used in affinity chromatography, solid phase immunoassay, and in therapeutic contexts where
coupling of immunoglobulins to another epitope is desired.

[0137] In view of the foregoing, it should be apparent that the invention is unlimited with respect to the specific types
of single polypeptide chains and linkers as well as the types of BABS and effector proteins to be linked. Accordingly,
other embodiments are within the following claims.

[0138] Also contemplated is a biosynthetic binding protein expressed from DNA derived by recombinant techniques,

said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains connected
by a polypeptide linker, the amino acid sequence of each of said polypeptide domains comprising a set of CDRs
interposed between a set of FRs, each of which is respectively homologous with at least a portion of CDRs and
FRs from an immunoglobulin molecule,

said polypeptide linker comprising plural, peptide-bonded amino acids defining a polypeptide of a length sufficient
to span the distance between the C-terminal end of one of said domains and the N-terminal end of the other of
said domains when said binding protein assumes a conformation suitable for binding, and comprising hydrophilic
amino acids which together assume an unstructured polypeptide configuration in aqueous solution,

said binding protein being capable of binding to a preselected antigenic site, determined by the collective tertiary
structure of said sets of CDRs held in proper conformation by said sets of FRs and said linker when disposed in
aqueous solution.



Claims 

1. A single polypeptide chain comprising a linking sequence of a length of at least 10 amino acid residues, the linking
sequence connecting a first and a second non-naturally peptide-bonded, biologically active polypeptide domain
to form a single polypeptide chain comprising at least two biologically active domains connected by said linking
sequence, said linking sequence comprising hydrophilic peptide-bonded amino acids exhibiting small and
unreactive side chains but no cysteine, the hydrophilic amino acids constituting a hydrophilic sequence having a flexible
unstructured configuration essentially free of secondary structure in aqueous solution, the linking sequence having
a plurality of glycine or serine residues and spanning the distance between the C-terminal end of the first domain
and the N-terminal end of the second domain.

2. The polypeptide chain of claim 1, wherein said linking sequence comprises threonine.

3. The polypeptide chain of claim 1 or 2, further comprising said first domain connected by a peptide bond to said
N-terminal end of said linking sequence and a second domain connected by a peptide bond to the C-terminal end
of said linking sequence.

4. The polypeptide chain of claim 1, wherein said linking sequence comprises plural consecutive copies of an amino
acid sequence.

5. The polypeptide chain of claim 4, comprising the amino acid sequence (GlyGlyGlyGlySer)3.

6. The polypeptide chain of claim 1, wherein said linking sequence comprises one or a pair of amino acid sequences
recognizable by a site-specific cleavage agent.

7. DNA encoding the polypeptide chain of any of the preceding claims. 

8. A polypeptide linker, the linker having a length of at least 10 amino acid residues and linking two non-naturally 
linked polypeptide domains to form a multifunctional protein, said linker exhibiting amino acids with small and 
unreactive side chains and comprising plural hydrophilic peptide-bonded amino acids constituting a hydrophilic 
sequence, said linker spanning the distance between the C-terminal end of a first domain and the N-terminal end 
of a second domain, wherein each said domain comprises a biologically active polypeptide having a conformation 
suitable for biological activity independent of the biological activity of the other domain. 

9. A polypeptide linker, the linker having a length of at least 10 amino acid residues and linking two non-naturally 
linked polypeptide domains to form a functional protein, said linker exhibiting amino acids with small and unreactive 
side chains and comprising plural hydrophilic peptide-bonded amino acids constituting a hydrophilic sequence, 
said linker spanning the distance between the C-terminal end of a first domain and the N-terminal end of a second 
domain, wherein said domains together comprise an immunologically reactive binding site specific for a preselected 
antigen. 

10. The polypeptide linker of claim 9, wherein said two domains mimic a VH and VL chain from a natural immunoglobulin.

11. The polypeptide linker of claim 8 or 9, which linker 

(a) comprises threonine, or 

(b) is cysteine-free, or 

(c) comprises a plurality of glycine or serine residues, or 

(d) comprises plural consecutive copies of an amino acid sequence, or 

(e) spans a distance of at least 4 nm (40 Angstroms), or 

(f) comprises the amino acid sequence GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer, or

(g) comprises one amino acid sequence or a pair of amino acid sequences recognizable by a site-specific 
cleavage agent. 

12. The polypeptide linker of claim 8, wherein at least one of said domains comprises an enzyme, a toxin, a receptor,
a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation factor, a lymphokine, a
cytokine, a hormone, a remotely detectable moiety, or an anti-metabolite.

13. The polypeptide linker of claim 8, wherein said first domain comprises a single chain binding site and said second
domain comprises an enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody binding site, a growth
factor, a cell-differentiation factor, a lymphokine, a cytokine, a hormone, or an anti-metabolite.

14. The polypeptide linker of claim 8, wherein at least one of said domains comprises a polypeptide capable of
sequestering an ion.

15. The polypeptide linker of claim 14, wherein the polypeptide comprises calmodulin, metallothionein, a fragment
thereof, or an amino acid sequence rich in at least one of glutamic acid, aspartic acid, lysine, and arginine.

16. The polypeptide linker of claim 8 or 9, wherein the amino acids of said linker together assume an unstructured
polypeptide configuration in aqueous solution.

17. DNA encoding the polypeptide linker of claim 8 or 9. 

18. Host cell transformed with and capable of expressing the DNA of claim 17. 
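
The sequence-level features recited for the linker (length of at least 10 residues, no cysteine, plural glycine or serine residues, repeated copies of a unit, a span of at least 4 nm) can be checked for the (GlyGlyGlyGlySer)3 example with a short sketch. This is an illustration only, not a claim construction: "plurality" is read here as at least two residues, and the 0.38 nm rise per residue is an assumed approximate value for a fully extended chain, so the computed span is an upper bound rather than the typical end-to-end distance of a flexible linker.

```python
RISE_PER_RESIDUE_NM = 0.38  # assumption: approximate spacing per residue, fully extended


def build_linker(unit="GGGGS", copies=3):
    """Consecutive copies of a repeat unit, in one-letter amino acid code."""
    return unit * copies


def check_linker(seq, min_len=10):
    """Simple checks of the sequence-level features listed in the claims."""
    return {
        "length_ok": len(seq) >= min_len,
        "cysteine_free": "C" not in seq,
        # "plurality" interpreted as two or more Gly/Ser residues (assumption)
        "plural_gly_or_ser": sum(seq.count(r) for r in "GS") >= 2,
    }


def max_span_nm(seq, rise_nm=RISE_PER_RESIDUE_NM):
    """Upper-bound span of the linker if it were fully extended."""
    return len(seq) * rise_nm


linker = build_linker()
report = check_linker(linker)
span = max_span_nm(linker)

if __name__ == "__main__":
    print(linker, report)
    print("fully extended span (nm):", span, "meets 4 nm:", span >= 4.0)
```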


[Drawing sheet: nucleotide sequence (positions numbered 10 to 370) with three-letter amino acid translation and restriction site map; sequence text not recoverable.]
[Drawing sheet: nucleotide sequence (positions numbered 10 to 350) with three-letter amino acid translation and restriction site map; sequence text not recoverable.]
[Drawing sheet: nucleotide sequence (positions numbered 10 to 370) with partial second-strand segments, three-letter amino acid translation, and restriction site map; sequence text not recoverable.]
[Drawing sheet: nucleotide sequence (positions numbered 10 to 350) with partial second-strand segments, three-letter amino acid translation, and restriction site map; sequence text not recoverable.]
[Drawing sheet: nucleotide sequence (positions numbered 10 to 370) with three-letter amino acid translation and restriction site map; sequence text not recoverable.]
[Drawing sheet: nucleotide sequence (positions numbered 10 to 360) with three-letter amino acid translation and restriction site map; sequence text not recoverable.]
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XO 20 30 40 50 CO 70 

CAAGTTCAACTGCAGCACTCTCCTCCTCAATTCCTrAAACCTOGCCCCTCTGTCCC^^ 

SVOlQOSCPELVXPGASVJtMSCKSS 
BftvX* Avon AhoXX MhBX Mnll* 

FAU4H2 SiuUI BinXMnlX* HlnPX 

PStX CcoftXX PlpXHllXXX 

HlftXX W»pHX 
HhftX 
HlnPX 
NtrX 
HloXV 

scrrx 

X,, M-l , Xl 

I 85 95 105 115 | 125 135 145 

CCGTACCCCCAGTCTCATGCTAACTCTCTAGACTTTAAAGGTAACGCCACCCTTACTCTCCACAAATCTTCCTCA 
GYROSHGKSLOrKCKATLTVOKSSS 
BanX B»tXl NlaXXI XbaT AeoX H&oXX- 

KpnX Pr « T MineXX MnlX* 

mnv s*ix 

RsaX TaqX 

I X J I 

160 170 160 190 200 RIO | 220 

ACTGCTTACATGGAGCTCC CTTCTTT GACCTCTCAGGACTCCGCGCTATACTATTGCCCGCGTATCCATTA7TCG 
TAYMrLRSLTSrOSAVVYCARXOYW 
AluX DdiX HinfX ACCX AceXX < ;i^T HI 

NlaXXXBbvX- MnlX*MnlX- AceXX AeeXZ T»qX S 

rnu4Ki NtpBxx aiaflu 

soexx Knti 

HhoX 
HlnPX 
Hin?X 

«5 245 255 265 

CCCCATCCCCCTAGCCrrACCGTGACCTCCTAAGCATCC ^/fi. <* 

GHGASVTVSS'GS 
•XV Haoxx AluX DooXBamHX 

•u96X HhtX BtnXXMstXXNUXV 
HltXXX HlnPX BSP1206 Sau3A 

MeoX NhoX HgiAI XhoXX 

HlaXXX saex 
Styx 
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10 20 30 40 50 60 70 

GAATTCATCCCTCACAACAAATTCXACAACCXACACCACAACCCCTrCTACCACATCTTCCACCTCCCCAACCTC 

SFMAD NXFHXEQQNAFYE ZLHLPNL 
rcoRI MlttZ BglZZ BspMX* 

XmnX 

SS 95 105 115 . 125 135 145 

AACCAACACCACCGTAACCCCTTCATCCAAACCTTCAAACACCACCCCTCTCACACCGCTAACCTCCTCCCACAG 
NEEQRNGFZQSLXDDPSQSANLLAE 

HindZZZ BspMZ* 

EC047III 

160 170 180 190 200 210 220 

GCCAAGAAACTCAACGACGCTCAGCCGCCGAAGAGTGATCCCGAAGTTCAACTGCACCAGTCTGGTCCTGAATTG 
AXXLNDAQAPKSDPEVQLOQSCPEL 

Karl P»tl 

235 245 255 265 275 285 295 . 

GTTAAACCTGGCGCCTCTGTGCGCATGTCCTGCAAATCCTCTCGGTACATTTTCACCGACTTCTACATGAATTGG 
VXPCA SVRMSCXSSGYIFTDFYMNW 
Nftrl Fspl 

310 320 330 340 350 360 370 

GTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATCGCGTACATTTCCCCATACTCTGGGGTTACCGGCTACAAC 
VRQSHGKSLDYZGYZSPYSGVTGYN 
BfitXI XbaZ PflMI BstEII 

385 395 405 415 425 435 445 

CAGAAGTTTAAAGCTAAGGCCACCCTTACTCTCGACAAATCTTCCrr^ 
QXFXGKATLTVDXSSSTAYMELRSL 

Oral Sail 

460 470 480 490 500 510 520 

ACCTCTGAGGACTCCCCGGTATACTATrGCGCGGGCTCCTCTCGTAACAAATGGGCCATGGATTATTGGGGTCAT 
TSEDSAVYYCAGSSGHXWAMDYWGH 
SacII Hcol 

535 545 555 565 575 585 595 

GGTGCTAGCGTTACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCCGACGTC 
GASVTVSSGGGGSGGGGSGGGGSDV
NheI SacI BamHI AatII

610 620 630 640 650 660 670 

GTTGTTACCCAGACTCCGCTGTCrCTGCCGGTTTCTCTGGGTGACCACCCTrCTATTTCTTCCCGCTCTTCCCAG 
VVTQTPLSLPVSLGDQASZSCRSSQ 

BstEZZ PflM 

685 695 • 705 715 725 735 745 

TCTCTGGTCCATTCTAATGGTAACACTTACCTGAACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCT 
SLVKSNGNTYLNWYLQKAGQSPXLL 
I BatXZ BepKZ* HtndZZZ 

XpnZ 

FIG. 6A-X 
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T T — " 1 1 r 




µg OF DIGOXIN BINDING PROTEIN PER ml



760 770 780 790 800 810 820 

ATCTACAAAGTCTCTAACCCCTTCTCTGGTCTCCCGCATCGTITCT^ 
IYKVSNRFSGVPDRFSGSGSGTDFT 



835 



845 



855 



865 



875 



885 



895 




910 920 930 940 

TTTGGTGGTCGCACCAAGCTCGACATrAAACGTTAACTGCAG 
FGGGTKLEXKR* 

XhoZ Hpal PstI 



FIG. 
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10 20 30 40 50 60 

GATCCTGACGTCGTAATGACCCAGACTCCGCTGTCTCTGCCGGTTTCTCTGGGTGACCAG
DPDVVMTQTPLSLPVSLGDQ

70 80 90 100 110 120

GCTTCTATTTCTTGCCGCTCTTCCCAGTCTCTGGTCCATTCTAATGGTAACACTTACCTG
ASISCRSSQSLVHSNGNTYL

PflMI BstXI

130 140 150 160 170 180

AACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCTGATCTACAAAGTCTCTAAC
NWYLQKAGQSPKLLIYKVSN
BspMI+ HindIII

KpnI

190 200 210 220 230 240

CGCTTCTCTGGTGTCCCGGATCGTTTCTCTGGTTCTGGTTCTGGTACTGACTTCACCCTG
RFSGVPDRFSGSGSGTDFTL

250 260 270 280 290 300

AAGATCTCTCGTGTCGAGGCCGAAGACCTGGGTATCTACTTCTGCTCTCAGACTACTCAT
KISRVEAEDLGIYFCSQTTH
BglII

310 320 330 340 350 360

GTACCGCCGACTTTTGGTGGTGGCACCAAGCTCGAGATTAAACGTGGATC1GGAGGTGGC
VPPTFGGGTKLEIKRGSGGG

XhoI

370 380 390 400 410 420

GGATCTGGTGGAGGTGGCTCTGGTGGCGGTGGATCCGAAGTTCAATTGCAGCAGTCTGGT
GSGGGGSGGGGSEVQLQQSG

BamHI

430 440 450 460 470 480

CCTGAATTGGTTAAACCTGGCGCCTCTGTGCGCATGTCCTGCAAATCCTCTGGGTACATT
PELVKPGASVRMSCKSSGYI
NarI FspI

490 500 510 520 530 540

TTCACCGACTTCTACATGAATTGGGTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATC
FTDFYMNWVRQSHGKSLDYI

BstXI XbaI

550 560 570 580 590 600

GGGTACATTTCTCCATACTCTGGGGTTACCGGCTACAACCAGAAGTTTAAAGGTAAGGCG
GYISPYSGVTGYNQKFKGKA
PflMI BstEII DraI

610 620 630 640 650 660

ACCCTTACTGTCGACAAATCTTCCTCAACTGCTTACATGGAGCTGCGTTCTTTGACCTCT
TLTVDKSSSTAYMELRSLTS
SalI

670 680 690 700 710 720

GAGGACTCCGCGGTATACTATTGCGCGGGCTCCTCTGGTAACAAATGGGCCATGGATTAT
EDSAVYYCAGSSGNKWAMDY
SacII NcoI

730 740 750 760

TGGGGTCATGGTGCTAGCGTTACTGTGAGCTCTTAACTGCAG
WGHGASVTVSS*
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PICn. >B 
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10 20 30 40 50 60 

CAACTTCAACTCCACCACTCTCCTCCTCCATTGCTrCGACCTTCCCACACTCTCTCCCTC 
EVQLEQSGPGLVRPSQTLSL 

70 80 90 100 110 120 

ACCTGCACATCCTCTGGGTACATTTTCACCGACTTCTACATGAATTGCGTTCCCCACCCT 

TCTS5GYXFTDFYMNWVRQP 
BapMX+ BstXI 

130 140 130 160 170 180 

CCTGGTCGCGGTCTAGACTACATCCGGTACATTTCCCCATACTCTGGG6TTACCGGCTAC 
PGRGLDYXGY X5 FYS GVTGY 
Xbal PflKI BltEXX 

190 200 210 220 230 240 

AACCAGAAGTTTAAAGGTAAGGCGACCCTTCTGGTCAACAAATCTAAGAACCACCCTTCC 
NQXFKGXATLLVNKSKH QAS 
Oral 

250 260 270 280 290 300 

CTGCGGCTGTCTTCTGTGACCGCTGCGGACACCGCCGTATACTATTGCCCGCCCTCCTCT 
tRLSSVTAADTAVYYCAGSS 

SaeXX 

310 320 330 340 350 360 

GGTAACAAATCGGCCATGCATTATrGGGGTCAGCCTTCTCTGGTTACTCTCAGCTCTGGT 
GHKWAMDYWGQGSLVTV SSG 
NeoX Sad 

370 380 390 400 410 420 

GGCGSTGGGTCGGGCGGTGGTGGCTCGGGTGGCCGCGGATCCGACGTCGTTATGACCCAG 
CCG SGCGGSGGCGSDVVMTQ 

Ba&HX AatXX 

430 440 450 460 470 480 

CCTCCGTCCGTTTCGGGGGCTCCTGGTCAGCGGGTTACTATTTCT7CCCGCTCTTCCCAG 
FFSVSGAPGQRVTXSCR. SSQ 

PflM 

490 SOO 510 520 530 540 

TCTCTGGTCCATTCTAATCGTAACACTTACCTGAACTGGTACCAGCAACX6CCTGGTACG 

SLVHSNGHTYLHW YQQLPGT 
Z BstXX Kpnl 

550 560 570 580 590 60 0 

GCTCCGAAGCTTCTGATCTACAAAGTCTCTAACCGCTTCTCTGGTCTCCCCGATCGTTTC 
AP JCLLXYXVSHRFSGV PDRF 
HindZZZ 

610 620 630 640 650 660 

TCTGGTTCTGGTTCTGGTACTGACTTCACCCTCGCGATCACTCGTCTCCACGCCGAAGAC 
SGSGSGTOFTLAXTGLQAEO 

670 680 690 700 710 720 

GAGGCTGACTACTTCTGCTCTCAGACTACTCATGTACCGCCGACTTrTGGTGGTGGCACC 
£ A DY FCSQTTHVPPTFG GGT 

730 740 750 q 

AACCTCACGGTTCTGCCTTAACTGCAG C I Gl 1 A 

KLTVLR* LQ r ■ wi. 

Hpal PstX 
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GJ*rrcc;JterrcAACTcJLcA<rae^^ 

BFEV01.Q0SCPKI.VK P C A S V 
ECORI 

10 80 »0 100 110 110 

roCATCTCCTCOUUTCCTCTGCCTACACCWCACCAA^ 
RMSCKSSCYTFTHYYXH W X. K 

pi 

120 140 150 ltt 170 ijo 

CA6TCTCWGGTAA6TCT«A6AGT6CATCCG«^ 

Xfeax s** 1 

190 200 110 220 230 240 

AXGTACXXTGAOXACTTTXXAOCTXfcBOCGACCCTTACTCTCCXCXJATCTTCCTCAXCT 
KYNEHFXGKATX.T V 0 K S S S T 
Oral S * J * 

2*0 260 270 280 290 300 

CCTTACATCCACCTGCCrTCTrrGACCTCTGAGCACTCCCCGCTATACTA 

kvMrt-SSLTSEDSAVYYCAK 
A-YMEX.XSl.T>* SaelJ b^hjj 

j 10 220 330 340 350 340 

TACACTCATTATTACTTCGATTATTGGGGCCATGGC6CTAGC6WACCCT6ACCICT6CT 
VTHYYFDYWGHGASVTVSSO 

270 380 390 400 410 430 

66C6CTG6CTCCG6CG6TCGTG66TCGG6TGGCGGCG6ATCCGACGTCCTTATGACCCAG 

CGGSCCGGSGGGGSDVVBTw 

BaaHX AatXl 

420 440 450 460 470 480 

ACTCCGCTGTCTCTGCCGGTTTCXCTGGGTGACCACCCTICTATTTCrTGCCGCTCITCC 
'TPLSLPVSLGDOASXSCR55 

BstEXX 

490 500 510 920 530 540 

CAGTCTATCGTCCATTCTAATGGTAACACTTACCTCGAGTMTACCWCAAAAGGCIGCT 

QSJVKSHGMTYX.EW Y L 0 K A G 

BstXX v— ? ^ 

itpm 

550 360 570 MO P? ^ rr _- iJ£? 

CACTCTCCCAACCTTCT6ATCTACAAA6TCTCTAACC6CTTCTCA CC 1 Q 1 > CU^ATCCT 

QSFKtLIVKVSHRrSCVFOR 
HindXXX 

ei0 620 630 MO 6S0 660 

TrCTCTC U lTUUU m T C CTAC^ 
FSCSCSCTDPTL S R V E A E 

670 660 6*0 700 710 720 

GATCTGGCT ATCTACTACTC crPCCAACCCTCTCATCTACCW^CACnTCCCCCCTCCC 
0 L C I yycrQCSRVPWTFCCC 

730 740 750 Q _ 

ACCAAG CT CGAGATT AAA CGTTAACTGCAG C| G» I JD 

Xhol HpftZ P«tX 
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10 20 30 40 50 60 

CATCCCCACCTTATCCTCCTTCAATCTCCTCCAGTACTCATCCAXCCTCCTCCGTCCCTC 
DPEVMLVESGGVLMEPGCSX, 

Seal tCOQ 

70 80 90 100 110 120 

AACCTCACCrCTCCTCCTACCCCCTTOVCCTTCrrcrrCCTrACCCCATCTCTTC 
KLSCAASGFTFSRY AMSWVR 
EspX NUel PflMI 

130 140 ISO 160 170 180 

CACACTCCCGAGAACCCTCTACACTCCGTCCCCACCATATCTTCTCCTCGT T CTCACAOC 
Q T P £ KRLEWVATX 5SGG8HT 
N B8pKXX XbaX Nrul EcoRV 

190 200 210 220 230 240 

TTCCATCCAGACAGTGTGAAGGGTCGATrCACGATCTCTCGAGACAACCCTAAGAACACG 
FHPDSVXGRFTXSRDNAKNT 

XhoX 

250 260 270 280 290 300 

TTCTACCTGCAAATGTCTTCTCTACGTAGTG AAGATACTG CTATGTACTACTCTG CACGT 
tYLQMSS LRS EDTAMYYCAR 
BspMI* SnaBX ApaLI 

310 320 330 340 350 360 

CCTCCACTG ATCTCACTAGTTG CTGATTATG CCATGC ATTATTGGGGTCATGGTG CTAGC 
PPLI S L V A D YAM 0 Y W GHGAS 
SpeZ NcoX Nhel 

370 380 390 400 410 420 

GTTACTGTCAGCTCTGGTCCCGGTCGGTCGGGCGGTGGTGGCTCGGGTGGCGCCGGATCC 
VTVSSGGG GSG GGGSGGGGS 

$aeX 

430 440 450 460 470 480 

CATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCI 

DXVMTQSHKFMSTSVGDRVS 
EcoRV BstEXX 

« 

490 500 510 520 530 540 

ATOICTTGTAAGCCCACCCACGATGTGGGTGCTCCTATCGCATGGTATCAGCAGAAGCCC 
ITCKASQDVGAAXAWYQQKP 
PflMI S»a 

550 560 570 580 590 600 

GGGCAGTCTCCTAACCrGCTGATCTACTCGGCGTCCACTCCTCATACTGGTGTCCCCGAT 

GQSPXLXtXYWAStRHTGVPD 
X SaXX 

610 620 630 640 650 660 

CGTTTCACTGGGTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACGTTCAGTCT 
RFTGSGSGTOPTXiTXSNVQS 
BspMXX AsuX; 

670 680 690 700 710 720 

GATGACCTGGCTGATTACTTCTGCCAGCAATATTCCGGGTACCCTCTGACTTTCGGTGCC 
D DLADYFCQQYSGYPLTFG.A 

SepI KpnX Nae 

730 740 750 r-i r Q N 

GGCACTAAACTCGAGCTGAACTAACTCCAG r J \J\. »D 

GTXLELK* 
X XhoX PstX 
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10 20 30 40 50 60 

CATCCCCACCTTATCCTCCTTCAATCTCCTCCACTACTCATCCAACCTCCTCCCTCCCTC 
O PEVMLVESGGVLMEPCCSL 

Seal EcoO 

70 80 90 100 110 120 

AAGCTGAGCTCTGCTGCTAGCGGCTTCACCTTCT 
KLSCAASGFTFSRY A M S W V R 

Espl Nhel 

130 140 150 160 170 180 

CAGACTCCGGAGAAGCGTCTAGACTGGGTCGCGACGATATCTTCTGGTGGTTCGAACACT 
OTPEKRLEWVATXSSCGSNT 
° B*pKXX Xbal Krul ECORV ABttXX 

190 200 210 220 230 240 

TACTATCCAGACACTGT6AAGCCTCCATTCACGATCTCTCGAGACAACGCTAAGAACACG 
YYPDSVXCRFTXSRDWAKNT 

XhoX 

250 260 270 280 290 300 

TTCTACCTGCAAATGTCTTCTCTACCTACTCAACATACTCCTATGTACTACTCTGCACGT 
LYLQMSSLRSEDTAMYYCAR 
BspMI* SnaBI ApatX 

310 320 330 340 350 360 

CCTCCACTGATCTCACTACTTG CTGATTATCCCATGGATTATTGGCCTCATGGTC CTAGC 
P P L X S LVADYAMDYWG HGAS 
SpeX HcoX Hhel 

370 380 390 400 410 420 

CTTACTGTGACCTCTGGTGGCGCTGGGTCGGGCGGTGGTGGCTCGGGTGGCCCCCGATCG 
VTVSSGGGGSGGGGSGGGGS 

SacX 

430 440 450 460 470 480 

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCT 

DXVMTQSH.XFMSTSVG DRVS 
ECORV BstEIX 

490 500 510 520 530 540 

ATCACTTGTAAGCCCAGCCAGGATCTGGGTGCTGCTATCCCATGGTATCAGCAGAAGCCC 
ITCXASQOVGAAXAWYQQKP 
PflKI Saa 

550 560 570 580 590 600 

GGCCAGTCTCCTAAGCTGCTGATCTACTGGGCGTCCACTCCTCATACTGGTGTCCCGGAT 

GQ5PKLLX YWASTRHTGVPO 
2 SalX 

610 620 630 640 650 660 

CGTTTCACTGGGTCCGCATCAGGTACTCAlTrCACTCTGACTATrTCCAACGTTCAGTCT 
RFT5S-GSGT DPTLTXSMVQS 
BspMXX AauIX 

670 660 690 700 710 720 

GATGACCTGGCTGATTACTTCTCCCAGCAATATTCCGGGTACCCTCTGACTTTCGGTGCC 
DDLADYFCQQYSCYPtTFGA 

SspX Xpnl Nae 

730 740 . 750 ri fl RE 

GGCACTAAACTCGACCTGAACTAACTGCAG • 

GTXLELK* 
I Xhol PstX 
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10 20 
Met Lys Ala Ile Phe Val Leu Lys Gly Ser Leu Asp Arg Asp Leu Asp Ser Arg Leu Asp
ATG AAA GCA ATT TTC GTA CTC AAA GGT TCA CTC GAC AGA GAT CTC GAC TCT CGT CTC GAT

BglII

30 40
Leu Asp Val Arg Thr Asp His Lys Asp Leu Ser Asp His Leu Val Leu Val Asp Leu Ala
CTC GAC GTT CGT ACC GAC CAC AAA GAC CTC TCT GAT CAC CTC GTT CTC GTC GAC CTC GCT

BclI SalI
50 60
Arg Asn Asp Leu Ala Arg Ile Val Thr Pro Gly Ser Arg Tyr Val Ala Asp Leu Glu Phe
CGT AAC GAC CTC GCT CGT ATC GTT ACT CCC GGC TCT CGT TAC GTT GCC GAT CTC GAA TTC

SnaX BcoRI 

Asp
GAT

FIG. 10A



EcoRI 




FIG. 10B



AflH 
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94- 

43-
29-

20.1-
14.4-

0 1 2 3 4 5

FIG. 11



DVOLQESGPGLVKPSOSLSLTCSVTGYSIT 
SGYFWNWXRQFPGNKLEWLGFIKYDGSNYG 
NPSLKNRVSXTRDTSENQFFLXLDSVTTAT 
YYCAGOHOHLYFDYWGOGTTLTVS 

GGGGSGGGGSGGGGS 



QAVVTQESALTTSPGGTVILTCRSSTGAVT 
TSNYANWI QEKPDHLFTGLIGGTSNRAPGV 
PVRFSGSLI GDKAALTITGAQTEDDAMYFC 
ALWFRNHFVFGGGTKVTVLG 



FIG. 9C 
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INHIBITOR CONCENTRATION f Mj 
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10 20 30 40 50 60 

GAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG 

EFMADNKFNKEQQNAFYEIL
EcoRI MluI BglII

XmnI

70 80 90 100 110 120

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG
HLPNLNEEQRNGFIQSLKDE
BspMI+ HindIII

130 140 150 160 170 180

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCC
PSQSANLLADAKKLNDAQAP

NheI FspI

190 200 210 220 230 240

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG
KSDQGQFMADNKFNKEQQNA

MluI
XmnI

250 260 270 280 290 300

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA
FYEILHLPNLNEEQRNGFIQ
BglII BspMI+ H

310 320 330 340 350 360

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC

SLKDEPSQSANLLADAKKLN
indIII NheI

370 380

GATGCGCAGGCACCGAAATCGGATCC

DAQAPKSDP
FspI BamHI
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* I * 



(BABS) • 

10 20 30 40 50 60 70 

GGATCCGGTAACTCTCACTCTGAATGCCCGCTGAGCCACGATC 
GSGNSDSECPLSHDGYCLHDGVCMY 

BanHI 8*al+ E»pl 

85 95 X05 119 125 135 145 

ATCGAACCTCTCCACAAATACCCATGCAACTGCCTTGTACCCTACATCGCTCA6CCCTCCCACTATCCCCATCTC 

IEALDKYACHCVVGYICERCQYRDI- 
Sphl NruI 

AAATGGTGGCAGCTCCGTTAACTGCAG r»W» 
K W W E L R • 

Hpal PstI 
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(BABS)- 

10 20 30 40 30 60 

CCATCCCOTCGCCACCCCTCCAAGGACTCaUACCTCAGGlTTCTCCTCCCCAAGCTCCT 
GSGGDPSKDSKAQVSAAEAC 

BamHI 

70 80 90 100 HO 120 

ATCACTGGCACCTGGTATAACCAACTGGGGTCGACTTTCATTGTGACCGCTGGTCCGGAC 
ITGTWYMQtGSTFXVTAGAD 

Sail 

130 140 150 160 170 180 

GGAGCTCTGACTGGCACCTACGAATCTGCCGTTGGTAACGCAGAATCCCCCTACGTACTG 
GALTGTYESAVGNAESRYVL 
Sad SnaBI 

190 200 210 220 230 240 

ACTCGCCGTTATGACTCTGCACCTGCCACCGATGGCTCTCCTACCGCTCTGGGCTGGACT 
TGRYDSAPATDGSGTALGW.T 

BspMI+ KpnZ 

250 260 270 280 290 300 

GTGGCTTGCAAAAACAACTATCGTAATGCGCACAGCGCCACTACGTGGTCTGGCCAATAC 
VAWKNNYRHAHSATTWSGQY 

Fspl DralZX Ball 

FflMI BstXI 

310 320 330 340 350 360 

GTrGGCCGTCCTGAGCCTCGTATCAACACTCAGTGGCTGTTAACATCCGGCACTACCGAA 
VGGAEARXNTQWLLTSGTTE 

Drain Hpal 

370 380 390 400 410 420 

CCGAATCCATGGAAATCGACACTAGTAGGTCATCACACCTTI^CCAAAGTTAAGCCTTCT 
AHAWKSTLVCKOTPTKVK'PS 
Bsml+ Spcl 
Hsil 

430 440 450 460 470 480 

CCTGCTAGCATTGATGCTCCCAAGAAAGCAGGCCTAAACAACGGTAACCCTCTAGACGCT 
AASIDAAKKAGVNMGNPLDA 
Nhel BstEXI Xbal 

490 500 

GTTCAGCAATAACTGCAG c . , \ fi A 

v q q + ri\an# 1 w 

p»ti 
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( BABS ) - 

-in 20 30 40 50 60 

GGATCCGGTGTACGTAGCTCCTCTCCCACTCMTCCGATAAGCCGGWGCTCATGTACTT 

GSGVRSSSRTPSDKPVAnv 
BanHI SnaBI 

,„ 80 90 100 110 120 

ANPQAEGQLQWLMRRAii a u 
MatXI °* 
,, n 140 150 160 170 180 

gcaaacgJcSttgagctccg^^ 

ANGVELRDNQL V V P S E e t* * 
Sad PflMI Kpnl 

100 200 210 220 230 240 

ATCTATTCTCAAGTACTGTTCAAGGGTCAGGGCTGCCCGTCGACTCATGTOCTGCTGACT 

XYSQVLFKGQGC P S T n V I* 4» * 
Seal Sal1 

250 260 270 280 290 300 

CACACCATCAGCCGTATIGCTGTATCTTACCAGACCAAAGWAACCTCCTGAGCGCTATC 

HTISRIAVS, w HpalBspMI* EC047III 

Espl 

310 320 330 340 350 360 

AAGTCTCCGTGCCAGCGTGAAACTCCCGAGGGTGCAGAAGCGAAACCATGGTATGAACCG 

KS PCQRETPEGAEAK J^" x & r 

370 380 390 400 410 420 

ATCTACCTGGGTGGCGTATTTCAACTGGAGAAAGGTGACCGTCTGTCCGCAGAAATCAAC 

IYLGGVFQLEKGDRLSAEIW 

BstEII 

430 440 450 460 470 480 

CGTCCTGACTATCTAGATTTCGCTGAATCTGGCCAGGTGTACTTCGGTATTATCGCACTG 

RPDYLDFAESGQVYFGIIAI. 
Xbal Ball 



490 
TAACTGCAG 
* 

PstI 



FIG. 15C
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(BABS) - 

10 20 30 40 50 60 

GGATCCGGTGCTGATCAGCTGACTGACGAGCAGATCGCTGAATTTAAAGAGGCTTTCTCT 
GSGADQLTDEQIAEFKEAFS 

BamHI BclI PvuII DraI

70 80 90 100 110 120

CTGTTTGACAAAGACGGTGACGGTACCATCACTACCAAAGAGCTCGGCACCGTTATGCGC
LFDKDGDGTITTKELGTVMR

KpnI SacI FspI

130 140 150 160 170 180

AGCCTTGGCCAGAACCCGACTGAAGCTGAATTGCAGGACATGATCAACGAAGTCGACGCT
SLGQNPTEAELQDMINEVDA
BalI BclI SalI

190 200 210 220 230 240

GACGGTAACGGCACCATCGATTTTCCGGAATTTCTGAACCTGATGGCGCGCAAGATGAAA
DGNGTIDFPEFLNLMARKMK
ClaI BspMII BssHII

250 260 270 280 290 300

GACACTGACTCTGAAGAGGAACTGAAAGAGGCCTTCCGTGTTTTCGACAAAGACGGTAAC
DTDSEEELKEAFRVFDKDGN

StuI

310 320 330 340 350 360

GGTTTCATCTCGGCCGCTGAACTGCGTCACGTTATGACTAACCTGGGTGAAAAGCTTACT
GFISAAELRHVMTNLGEKLT
EagI HindIII

370 380 390 400 410 420

GACGAAGAAGTTGACGAAATGATTCGCGAAGCTGACGTCGATGGTGACGGCCAGGTTAAC
DEEVDEMIREADVDGDGQVN
XmnI NruI AatII HpaI

430 440 450

TACGAAGAGTTCGTTCAGGTTATGATGGCTAAGTAACTGCAG
YEEFVQVMMAK*

PstI

FIG. 15D
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( BABS ) • 

30 40 50 60 

GSCCCSLGSLTIA Bsia 
BaaHI 

ao 100 HO I 20 

TGCAAGAC^CTACCC^ 
CKTRTEVF BIS R * clftI Bs 

1+ 89J>AJ ' pvul 

,- 0 160 170 180 

xjcttcc^gcc^ 
txi 

... 920 230 240 

CGTAACGCTCAATGTCGACCGACTC^GTCCAGCTGCGT p y Q y R K X 

->-7ft 280 290 300 

!»? «ii J reftTCTTTAASAXGCCCACTGTTACTCTGGAAGACCATCTG 



SnaBI 

.^^GCGGCCGCACGTCCAGCTACTTAACTGCAG 



GCATGCAAATGTGAGACTGTAGCGGCCG-- p V T * 
A C K C E T V A A A R P ^ 
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(BAB5) - 

10 20 30 40 SO 60 

GGATCCGGTATATTCCCCAAACAATACCCAATTATAAACTTTACCACAGCGGGTGCCACT 

GSGIFPKQYPIINFTTAGAT
BamHI

70 80 90 100 110 120

GTGCAAAGCTACACAAACTTTATCAGAGCTGTTCGCGGTCGTTTAACAACTGGAGCTGAT
VQSYTNFIRAVRGRLTTGAD

130 140 150 160 170 180

GTGAGACATGAAATACCAGTGTTGCCAAACAGAGTTGGTTTGCCTATAAACCAACGGTTT
VRHEIPVLPNRVGLPINQRF

190 200 210 220 230 240

ATTTTAGTTGAACTCTCAAATCATGCAGAGCTTTCTGTTACATTAGCGCTGGATGTCACC
ILVELSNHAELSVTLALDVT

Eco47III

250 260 270 280 290 300

AATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCATATTTCTTTCATCCTGACAAT
NAYVVGYRAGNSAYFFHPDN

NdeI
NsiI

310 320 330 340 350 360

CAGGAAGATGCAGAAGCAATCACTCATCTTTTCACTGATGTTCAAAATCGATATACATTC
QEDAEAITHLFTDVQNRYTF

ClaI

370 380 390 400 410 420

GCCTTTGGTGGTAATTATGATAGACTTGAACAACTTGCTGGTAATCTGAGAGAAAATATC
AFGGNYDRLEQLAGNLRENI

430 440 450 460 470 480

GAGTTGGGAAATGGTCCACTAGAGGAGGCTATCTCAGCGCTTTATTATTACAGTACTGGT
ELGNGPLEEAISALYYYSTG

Eco47III ScaI

490 500 510 520 530 540

GGCACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAA
GTQLPTLARSFIICIQMISE

550 560 570 580 590 600

GCAGCAAGATTCCAATATATTGAGGGAGAAATGCGCACGAGAATTAGGTACAACCGGAGA
AARFQYIEGEMRTRIRYNRR

FspI Bgl
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(BABS)- 
BaoHI 

on 90 100 HO 120 

, \K0 160 170 1*0 

CGTATGCTGA^CAAATTCTACATCC^ k kaTELK HLQ 

210 220 230 240 

CLEEELKP^ scal 

,. fl 270 280 290 300 

TTCCACC^CGT^ 
F „ L R P R D L ci l S ^ 

330 340 350 360 

„. 3 g 0 400 410 420 

FIG.
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG 

GSGADNKFNKEQQNAFYEIL
BamHI MluI BglII

XmnI

70 80 90 100 110 120

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG
HLPNLNEEQRNGFIQSLKDE
BspMI+ HindIII

130 140 150 160 170 180

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG
PSQSANLLADAKKLNDAQAP

NheI FspI

190 200 210 220 230 240

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG
KSDQGQFMADNKFNKEQQNA

MluI
XmnI

250 260 270 280 290 300

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA
FYEILHLPNLNEEQRNGFIQ
BglII BspMI+ H

310 320 330 340 350 360

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC

SLKDEPSQSANLLADAKKLN
indIII NheI



370 380
GATGCGCAGGCACCGAAATAACTGCAG
DAQAPK*
FspI PstI



FIG. 15H
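The one-letter protein line and the restriction-site labels printed under each DNA row can be checked mechanically against the DNA row itself. The following is a minimal sketch in Python, assuming the standard genetic code and using the first 60 bases of the sequence above (the row beginning GGATCC). The helper names `translate` and `find_sites` are hypothetical and are not part of the original disclosure; the three recognition sequences are the standard ones for BamHI, MluI and BglII.

```python
# Standard genetic code, with codons enumerated over bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in reading frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def find_sites(dna, sites):
    """Return {enzyme: [1-based start positions]} for each recognition sequence."""
    dna = dna.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = dna.find(motif)
        while start != -1:
            positions.append(start + 1)  # figures number bases from 1
            start = dna.find(motif, start + 1)
        hits[name] = positions
    return hits


# First row (bases 1-60) of the sequence shown above.
ROW_1 = "GGATCCGGTGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG"

# Recognition sequences of three enzymes labelled under that row.
SITES = {"BamHI": "GGATCC", "MluI": "ACGCGT", "BglII": "AGATCT"}

if __name__ == "__main__":
    print(translate(ROW_1))
    print(find_sites(ROW_1, SITES))
```

Running the same two functions over the other rows gives a quick consistency check between each DNA line, its translation, and the enzyme labels beneath it.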






